BACKGROUND OF THE INVENTION



1. Field of the Invention 

The present invention relates generally to the fields of molecular and 
developmental biology. More particularly, it concerns methods and compositions
involving s-SHIP promoter regions, which can be used to promote transcription in 
particular cell types and at particular times during development. 

2. Description of Related Art 

Stem cells have been the focus of tremendous interest in recent years because of
the progress made in developmental and molecular biology and the promise of
therapeutic applications in a wide variety of contexts, from heart disease to diabetes, and
cancer to Parkinson's disease (see generally Abbott et al., 2003; Daley, 2003; Hirai,
2002; Kondo et al., 2003; Nakano, 2003). Toward fulfilling this promise, many
researchers have engaged in extensive studies to characterize factors and pathways in
stem cell development and to evaluate candidate therapeutic and diagnostic agents. Such
agents include proteins that are gene products, sometimes heterologous, in the stem cells.
The ability to express a transgene in stem cells is critical for providing data toward these
endeavors. The study of genes normally expressed in stem cells has yielded not only
information regarding the developmental, cellular, and molecular biology of these cells,
but also useful tools for further studies.

Pathways involved in stem cell function include signaling through phosphatidylinositol
3-kinase (PI3K), a protein that becomes activated through cell surface receptors. PI3K is
involved in the generation of phosphatidylinositol 3,4,5-triphosphate, which activates
signaling pathways leading to cell proliferation. The SH2-containing inositol
5'-phosphatase (SHIP1) removes the phosphate group from the D5 position of
phosphatidylinositols, which is considered a significant feedback mechanism on cell
activation for hematopoietic cells (Lioubin et al., 1996; Rohrschneider et al., 2000).

A form of SHIP1 lacking the SH2 domain has been identified and referred to as
stem or short SHIP (s-SHIP) (Tu et al., 2001). Tu et al. found the protein contains amino
acids encoded by exons 6-27 of SHIP1 and that it is expressed in embryonic and
hematopoietic stem cells. It was initially unclear how s-SHIP protein was produced from
the ship1 gene. Kavanaugh et al. (1996) suggested that SIP-110 was a spliced product of
SHIP1; however, Tu et al. (2001) proposed, based on its cDNA sequence, that it was
transcribed from a promoter within the SHIP1 gene. This was inferred from the fact that
the first 44 nucleotides of the s-SHIP cDNA were at the 5' end immediately before exon
6 of SHIP1. These 44 nucleotides were not contained in the SHIP1 cDNA, but were
identical to the 44 nucleotides of genomic ship1 intron 5, immediately adjacent to exon 6.
However, no functional evidence for an s-ship promoter was provided. Therefore, while a
promoter with the tissue-specific expression of s-ship could be valuable from both a
research and therapeutic/diagnostic perspective, further investigation of the s-ship gene
was required to identify the promoter and characterize any tissue-specific activity.
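
The sequence comparison underlying that inference (the leading nucleotides of the
cDNA matching the genomic sequence immediately upstream of exon 6) can be sketched
in a few lines of Python. This is an illustration only: the function name, the 0-based
indexing convention, and the toy sequences are assumptions introduced here and are not
actual ship1 sequence. In the comparison described above, the window would be 44
nucleotides.

```python
def upstream_matches_cdna(genomic: str, exon_start: int, cdna: str, n: int) -> bool:
    """Return True if the first n bases of the cDNA equal the n genomic bases
    immediately upstream of exon_start (0-based index of the exon's first base)."""
    if n > exon_start or n > len(cdna):
        raise ValueError("n exceeds the available sequence")
    return genomic[exon_start - n:exon_start].upper() == cdna[:n].upper()


# Toy placeholder sequences (hypothetical, NOT real ship1 sequence).
upstream_pad = "TTTTTTTT"
intron_tail = "GATTACAGGC"   # stands in for the intronic bases found in the cDNA
exon = "ATGGCCTTAA"          # stands in for the start of the downstream exon
genomic = upstream_pad + intron_tail + exon + "CCCC"
cdna = intron_tail + exon
exon_start = len(upstream_pad) + len(intron_tail)

if __name__ == "__main__":
    print(upstream_matches_cdna(genomic, exon_start, cdna, len(intron_tail)))
```

A match of this kind shows only that the transcript includes intronic sequence; as
noted above, it does not by itself provide functional evidence of promoter activity.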

SUMMARY OF THE INVENTION 

The present invention concerns methods and compositions involving a functional 
s-ship promoter. The invention includes nucleic acid molecules, host cells, and
transgenic organisms having an s-ship promoter, as well as methods of using the 
promoter for transcription, expression studies, stem cell analyses, and therapeutic 
applications. 

The present invention concerns an s-ship promoter and its functional derivatives. 

The term "promoter" is used according to its ordinary and plain meaning to a person
skilled in the art of eukaryotic transcriptional regulation. The terms "s-ship promoter" or
"s-ship1 promoter" refer to the nucleic acid sequence from the s-ship gene that is capable
of promoting transcription of a nucleic acid sequence that is connected to it
(downstream). Transcription can be assayed in any number of ways known to
those of skill in the art, including, but not limited to, an expression assay using a
screenable or selectable marker, a ribonuclease protection assay (RNAP), RT-PCR, and in
vitro transcription reactions, all of which are well known to those of skill in the art and
can be implemented using commercially available reagents and protocols (see generally,
Sambrook et al., 1989; Ausubel, 1992 and 1994, all of which are incorporated by
reference).
Compositions of the invention include isolated polynucleotides comprising an
s-ship promoter capable of promoting transcription. SEQ ID NO:1 is a 102 kb genomic
mouse ship1 sequence. In certain embodiments, the s-ship1 promoter comprises, or has at
least or at most 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170,
180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350,
360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530,
540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710,
720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890,
900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900,
3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300,
4400, 4500, 4600, 4700, 4800, 4900, 5000, 6000, 7000, 8000, 9000, 10000, 11000,
12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, 20000, 21000, 22000, 23000,
24000, 25000, 26000, 27000, 28000, 29000, 30000, or more contiguous nucleotides of
SEQ ID NO:1, or any range derivable therein. In specific embodiments, the s-ship1 promoter
includes one or more of the following regions of SEQ ID NO:1: from nucleotide (nt)
49485 to 60914 (11.5 kb-GFP construct); from nt 49485 to 57072, which is 7588 nt (7.6 kb
construct); from nt 49485 to 55810, which is 6326 nt (6.3 kb construct); from nt 49485 to
54755, but lacking 57050 to 57883 (6.2kb-GFP construct); from nt 51389 to 55810,
which is 4421 nt (4.4 kb construct); from nt 52199 to 56423, which is 4224 nt (4.2 kb
construct); from nt 53820 to 55810, which is 1990 nt (1.9 kb construct); from nt 54755 to
55810, which is 1055 nt (0.96 kb construct); or from nt 55668 to 55810, which is 142 nt
(44 nt construct). It is further contemplated that the lengths of contiguous nucleotides
discussed above can be applied with respect to these identified regions of SEQ ID NO:1,
as well as any other sequence disclosed herein.
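
As a worked illustration of how a region length follows from a pair of coordinates, the
short Python sketch below reports both the inclusive count (end - start + 1) and the
simple difference (end - start), since either counting convention may be intended for a
given region. The helper name and the assumption that coordinates are 1-based positions
within SEQ ID NO:1 are introduced here for illustration; the coordinates themselves are
those recited above.

```python
def span_lengths(start: int, end: int) -> tuple[int, int]:
    """Return (inclusive_length, difference) for a start/end coordinate pair.

    inclusive_length counts both endpoints; difference is end - start.
    """
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1, end - start


# Coordinates as recited in the text; keys are the construct labels used there.
REGIONS = {
    "7.6 kb": (49485, 57072),
    "6.3 kb": (49485, 55810),
    "4.4 kb": (51389, 55810),
    "4.2 kb": (52199, 56423),
    "1.9 kb": (53820, 55810),
}

if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        inclusive, difference = span_lengths(start, end)
        print(f"{name}: {inclusive} nt inclusive, {difference} nt by difference")
```

Reporting both values makes it easy to check a stated length against its coordinates
without presuming which convention was used.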

Functional derivatives of the s-ship promoter are also contemplated by the
invention. Functional derivatives of an s-ship promoter may be at
least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to the polynucleotides
discussed herein. Such derivatives may also be characterized by any of the lengths of
contiguous nucleotides discussed above.
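
For illustration, percent identity between two already-aligned sequences might be
computed as in the minimal Python sketch below. The function name, the use of "-" as the
gap character, and the choice of denominator (all aligned columns, excluding columns that
are gaps in both sequences) are assumptions made here; the text above does not specify an
alignment method or an identity definition.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap.

    Assumes the two strings come from a pairwise alignment and therefore
    have equal length. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy sequences (hypothetical): one mismatch in ten aligned positions.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Other definitions (for example, dividing by the length of the shorter sequence) give
different values for gapped alignments, so the convention should be stated alongside
any threshold.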
In specific embodiments, the s-ship promoter is capable of promoting
tissue-specific transcription. Transcription may be accomplished, in some embodiments of the
invention, in skin, a hair follicle, cornea, embryo, gonads, mammary gland, pancreas,
and/or smooth muscle. It is also contemplated that transcription may be achieved in cells
qualified as or with characteristics of stem cells, which may or may not be derived from
skin, a hair follicle, cornea, embryo, gonads, mammary gland, pancreas, and/or smooth
muscle. In some embodiments, transcription is achieved in a hematopoietic cell in a
tissue-specific manner, including in hematopoietic stem or progenitor cells, but also in
more mature or differentiated cells.

In some embodiments of the invention, an s-ship promoter is operably attached to
a heterologous nucleic acid. The term "heterologous" is used according to its plain and
ordinary meaning to a person skilled in the art of molecular biology. It is a relative term
and in the context of an s-ship promoter, it refers to a nucleic acid sequence that is not
normally found in nature (with respect to sequence and position) with the s-ship
promoter. In other words, it refers to any nucleic acid that is not the entire genomic
sequence of the s-ship gene. In some embodiments, the s-ship promoter is connected to a
nucleic acid sequence encoding part of the s-ship gene product or all or part of an s-ship1
cDNA sequence. Alternatively, the s-ship promoter may be placed at a location different
from that found in nature.

Because recombinant cells and transgenic animals, including knockout versions
thereof, are part of the invention, the present invention further encompasses nucleic acids
containing an s-ship gene or a portion thereof and a marker sequence, wherein the s-ship
gene is disrupted by the marker sequence. In some embodiments, the nucleic acid is
under the control of a promoter, which is an s-ship promoter in further embodiments. The
promoter may also be constitutive, inducible, or conditional. Promoters discussed herein
may be tissue-specific (spatially restricted), developmental-specific (providing
transcription at specific developmental stages or times), and/or temporally restricted.

The present invention further concerns expression cassettes, vectors, and host
cells that contain or include polynucleotides having an s-ship promoter that has been
isolated away from its chromosomal context. The polynucleotides and embodiments
discussed above may be implemented with respect to expression cassettes, vectors, and
host cells.

It is contemplated that the s-ship promoter may control the transcription of a
nucleic acid sequence encoding a marker. In some embodiments, the marker is
colorimetric, enzymatic, or fluorescent. Examples include, but are not limited to,
β-galactosidase, chloramphenicol acetyltransferase, luciferase, and green fluorescent protein. In
further embodiments, a heterologous nucleic acid segment encodes a therapeutic or
diagnostic gene product. The therapeutic or diagnostic gene product may be a protein or
RNA molecule, such as an siRNA or miRNA molecule. In some embodiments, the
therapeutic gene product is selected from the group consisting of a tumor suppressor, a
cytokine, a cytokine receptor, a differentiation-inducer, a growth factor, and a growth
factor receptor. Examples of such proteins are well known to those of skill in the art, and
include, but are not limited to, interleukins (IL-2, -6, -8, -9, -10, -11, -12, -13, -14, -15,
-16, -17, -18, -19, -20, -21, -22, -23, -24, etc.), interferons, receptor tyrosine kinases and
their ligands (kit/steel, CSFR/CSF, GM-CSFR/GM-CSF, PDGFR/PDGF, flk-1/VEGF,
Lif, EGF, FGF, etc.), transforming growth factors α and β, Epo, IGF, and tumor necrosis
factors α and β. A number of examples can be seen on the world wide web at
indstate.edu/thcme/mwking/growth-factors.html, which is specifically incorporated by
reference.

In some embodiments of the invention, a vector is a plasmid, YAC, BAC, or
virus. Viruses include adenovirus, adeno-associated virus, retrovirus, flavivirus, and
vaccinia virus.

Compositions of the invention may be prepared in a pharmaceutically or 
pharmacologically acceptable formulation. Such formulations are well known to those of 
skill in the art for use in in vivo contexts.

Other aspects of the invention include a host cell having an s-ship promoter
operably attached to a heterologous nucleic acid segment. In some embodiments, the host
cell is eukaryotic, though it may be prokaryotic. In specific embodiments, the host cell is
from a mammal, insect, bacterium, or yeast. Cells from monkeys, mice, rats, rabbits,
hamsters, ferrets, and humans are specifically contemplated for use with nucleic acids of
the invention. In some cases, the host cell is an embryonic cell, which may specifically be
a blastocyst cell. In other cases, the host cell is a stem or progenitor cell. In some cases,
the cell is a hematopoietic cell, meaning any cell in that lineage. It is contemplated that
the cell may be in vitro or in vivo.

Cells that can be used according to methods and compositions of the invention
include, but are not limited to, CD34+ cells (cells expressing CD34 on their surface),
undifferentiated cells, stem cells, progenitor cells, cord blood cells, placental cells,
neonatal or fetal cells, immature cells, pluripotent cells, and totipotent cells. The term
"stem cell" is used according to its ordinary meaning, for example, as described by the
National Institutes of Health (on the World Wide Web at stemcells.nih.gov). Stem cells
1) are "capable of dividing and renewing themselves for long periods"; 2) are
unspecialized; and 3) can give rise to specialized cell types.

The invention specifically contemplates the use of embryonic stem cells, adult
stem cells, or neonatal and fetal stem cells. An adult stem cell typically refers to a stem
cell from a particular organ or tissue that is capable of differentiating into one or more
cells of that organ or tissue. Umbilical cord blood contains stem cells that are similar to
embryonic stem cells in that they are believed to be capable of being differentiated into a
number of different cell types, as opposed to cell types of one particular organ or tissue.
Umbilical cord blood refers to blood that remains in the umbilical cord and placenta
following birth and after the cord is cut. "Placental blood" is understood to be
synonymous with cord blood; similarly, cord blood stem cell is considered synonymous
with placental or placental blood stem cell. The use of stem cells from umbilical cord
blood is specifically contemplated in certain embodiments of the invention. In some but
not all cases, the use of other stem cells is specifically not considered part of the
invention; in particular, pancreatic/endocrine progenitor or stem cells are not
considered for use with some embodiments. Furthermore, cells of the invention may be
characterized by cell surface antigens. Cell surface antigens and their correlation with cell
type and cell development are known to those of ordinary skill in the art.

It will be understood that cultures or samples containing cells discussed above are 
also contemplated for use according to methods and compositions of the invention. 

Further embodiments of the invention include cells for use in the generation of
transgenic organisms (knock-in and knock-out). Accordingly, there are recombinant host
cells in which one or both s-ship genes are disrupted by a marker sequence or in which all or
part of an s-ship gene is flanked by an excisable sequence, such as a loxP sequence. The
marker sequence serves the purpose of showing when the transgenic sequence is present
or absent in the cell.

The present invention further concerns transgenic animals comprising an s-ship
promoter operably attached to a heterologous nucleic acid segment. Mammals are
specifically contemplated, particularly mice. In some cases, the invention involves a
mammal having cells comprising an s-ship transgenic sequence. The s-ship sequence
may be knocked in or out in a restricted or controlled manner. For example, whether it is
knocked in or out may be controlled in a tissue-specific, inducible, conditional,
developmental or temporal manner. Consequently, animals may have heterologous genes
under the control of a promoter or system that operates in that way. The Cre-Lox system
is one example. The transgene of interest itself may not be under the control of a limited
promoter, but a secondary gene whose product initiates the knock-in or knock-out
process may be under such a promoter. In one embodiment, animals of the invention
may have an s-ship transgenic sequence that includes an s-ship coding sequence flanked
by loxP sequences. They may also have a heterologous nucleic acid sequence encoding a
Cre recombinase. In some cases, the nucleic acid sequence encoding the Cre recombinase
is under the control of an inducible or conditional promoter. Transgenic animals of the
invention are not limited by the Cre-Lox system, which serves as an example of how
expression may be controlled.

A number of methods are included as part of the present invention. In some
embodiments, there are methods for expressing a recombinant nucleic acid in a cell
comprising: a) transfecting the cell with an expression cassette comprising an s-ship
promoter operably attached to the recombinant nucleic acid, wherein the nucleic acid is
transcribed. The cell may be any of the host cells discussed above.

Other embodiments of the invention concern methods of screening for a candidate
substance that regulates activity of the s-ship promoter comprising a step selected from
the group consisting of: (a) contacting a nucleic acid comprising an s-ship promoter with
an s-ship promoter binding protein and the candidate substance under conditions that
allow binding between the protein and the promoter and determining whether the
candidate compound modulates the binding between the protein and the promoter; and
(b) contacting the candidate substance with a cell comprising the s-ship promoter
operably attached to a reporter gene coding for an expression product and assaying for
expression of the reporter gene expression product. One or both steps may be employed.
Ways of determining whether the candidate compound modulates binding between a
protein and the promoter are well known to those of skill in the art. The compound may
inhibit, reduce, decrease, eliminate, increase, promote, or tighten the binding between the
protein and the promoter. Assays for such an interaction include, but are not limited to,
electrophoretic mobility shift assays (EMSA), DNA footprinting, functional transcription
assays (as described above), Southwestern assays, and PCR-based assays.

The present invention also includes methods for identifying stem cells in a
population of cells comprising: (a) administering to cells in the population a nucleic acid
comprising an s-ship promoter operably attached to a reporter or marker gene. The
reporter or marker gene is then used to identify positively-expressing cells, which would
indicate the cell is a stem cell. The cell may be in an organ and/or in an animal. In some
embodiments, methods include sorting cells based on expression of the reporter or
marker gene. In addition to the assays discussed above, FACS analysis and other cell
sorting techniques may be employed. Methods may also include differentiation of the cells.

Aspects of the invention also concern methods for screening for a modulator of
cell function comprising: a) transfecting a stem or hematopoietic cell with an expression
cassette comprising an s-ship promoter operably attached to a nucleic acid encoding a
candidate modulator; and, b) assaying the cell for a cell function, wherein a difference in
cell function in the cell as compared to a cell in the absence of the candidate modulator is
indicative of a modulator. The term "modulator" refers to a substance that affects cell
function. It may affect cell function by acting on or through a pathway. The modulator
may inhibit, reduce, eliminate, decrease, increase, promote, induce, or enhance a
particular cell function or result of a pathway in the cell. It is contemplated that this
method may be employed to identify a modulator as a candidate therapeutic agent for the
treatment of a blood-related disease or condition.

Therapeutic methods are also provided by the present invention. Methods are not
necessarily limited to a particular disease or condition. It is contemplated that any method
involving expression in stem cells, or in other cells in which the s-ship promoter can
function, may be used in therapeutic methods of the invention. For example, the method
may be applied to pancreatic disorders and diseases.

Thus, in some embodiments, there is a method of treating a patient with a
blood-related disease or condition comprising: a) transfecting a cell with an expression cassette
comprising an s-ship promoter region operably attached to a therapeutic nucleic acid;
and, b) administering the cell to the patient. Blood-related diseases or conditions include
blood-related cancers (such as leukemia, lymphoma, or myeloma) and anemia. In some
cases, the blood-related condition can be treated using stem cell replacement therapy.

Cells for therapeutic use may, in addition to the cells discussed above, be bone
marrow cells, or be autologous or allogeneic.

It is contemplated that any embodiment discussed with respect to any method or 
composition described herein can be implemented with respect to any other method or 
composition described herein. 

The term "or" in the claims is used to mean "and/or" unless explicitly
indicated to refer to alternatives only or the alternatives are mutually exclusive, although
the disclosure supports a definition that refers to only alternatives and "and/or."

Throughout this application, the term "about" is used to indicate that a value 
includes the standard deviation of error for the device or method being employed to 
determine the value.

Following long-standing patent law convention, the words "a" and "an," when 
used in conjunction with the word "comprising" in the claims or specification, denote
one or more, unless specifically noted. 

Other objects, features and advantages of the present invention will become 
apparent from the following detailed description. It should be understood, however, that
the detailed description and the specific examples, while indicating specific embodiments 
of the invention, are given by way of illustration only, since various changes and 
modifications within the spirit and scope of the invention will become apparent to those 
skilled in the art from this detailed description. 

BRIEF DESCRIPTION OF THE DRAWINGS



The following drawings form part of the present specification and are included to 
further demonstrate certain aspects of the present invention. The invention may be better 
understood by reference to one or more of these drawings in combination with the
detailed description of specific embodiments presented herein. 

FIG. 1. ship1 genomic segments cloned into a promoter-less expression vector for
testing cell-specific promoter activity. The upper line represents the general ship1
genomic region containing potential activity for cell-specific s-SHIP expression. Intron 5
contains the likely promoter activity and transcription is proposed to begin before exon 6.
The 44 intronic nucleotides, contained in the s-SHIP cDNA, are shown in red. A 7.6 kb
genomic fragment (second line down), as well as the indicated sub-fragments, were
cloned into a promoter-less plasmid for GFP expression. The design and construction of
the plasmid is detailed in Materials and Methods.

FIG. 2. Flow cytometry analysis of cell type-specific promoter activity in D3 ES
cells vs. NIH3T3 cells. Each construct shown was cloned into a promoter-less GFP
plasmid, which was linearized and electroporated into D3 ES cells, or transfected into
NIH3T3 cells. G418 resistant colonies were then examined by flow cytometry for GFP
expression. Two different "empty vector" negative controls were utilized depending on
whether the insert contained a splice acceptor or both splice acceptor and donor. Both
these plasmids without genomic insert were negative for GFP expression in both cell
types, but only a single negative-control vector is shown. Two positive-control plasmids
were utilized in each experiment. These controls expressed GFP from an IRES, and one
expressed a protein insert; both were positive in each cell type.

FIG. 3A-B. Structure of the 11.5-kb and 6.2-kb transgenic promoter-GFP
constructs for in vivo analysis. FIG. 3A. Two promoter transgenic constructs were
prepared. The first construct is called the 11.5kb-GFP transgene, and contains the entire
genomic ship1 segment from the Sac I site near the 5' end of intron 5 through the
putative translation start site at an ATG preceded by a suitable Kozak sequence within
exon 7. The translational start ATG for the enhanced GFP is fused, in frame, to the likely
ATG translational start for s-SHIP. A second transgenic construct, called the 6.2kb-GFP
transgene, is identical to the 11.5kb-GFP construct, except it contains only 0.96 kb
upstream of exon 6, and lacks 833 nt within intron 6. FIG. 3B. Transgenic copy numbers
were estimated by semi-quantitative PCR analysis relative to the endogenous diploid
gab2 gene.

FIG. 4. Computer analysis of 600 nucleotides of the intron-5 transgene promoter
region. A. The region immediately upstream of exon 6 is shown with potential
transcription factor binding motifs determined by analysis using the MatInspector
program. Only the factors with matrix and core similarity greater than 0.9 are shown.
Those factor motifs within the strand shown are overlined, while those factors
potentially interacting with the complementary strand are shown underlined. The SSR or
stem-SHIP region identified by Tu et al., 2001 is in bold, and an initiator sequence for
transcription is situated at the beginning of the SSR. Exon 6 (not shown) begins at the 3'
end of the SSR.

FIG. 5. The two primary proteins, s-SHIP and SHIP1, are produced from the
ship1 gene. The domain structure of the two proteins is shown above the ship1 genomic
intron/exon organization. Transcription for the full-length 145 kDa SHIP1 protein
initiates in promoter 1 (Pro 1), utilizes all 27 exons, and translation begins in the exon 1
encoded region. The stop codon is the first three nucleotides of exon 27. Transcription
for the s-SHIP protein begins within intron 5 (Pro 2), and downstream is identical to the
SHIP1 product. Translation, however, presumably begins in the first ATG of exon 7.
Both transcripts and protein sequences are therefore identical from the ATG in exon 7
through the stop codon in exon 27. The dashed lines indicate translation start and stop
points for each protein within the genomic exons.

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The present invention is based on the isolation and characterization of the s-ship1
promoter, which can be used to promote transcription. Methods and compositions
involving the s-ship1 promoter are provided herein. In some embodiments, they take
advantage of the tissue specificity of s-ship1 expression. The s-ship1 gene encodes a protein whose
expression has been observed in limited cell populations, and thus, the tissue-specificity
of its promoter can be exploited in a number of different ways.

I. SHIP1 and s-SHIP Background 

The s-ship1 promoter was studied because of the function and expression patterns
for the s-ship1 (also referred to as s-ship) and ship1 gene products. The murine SHIP1
protein is encoded in 27 exons of the Inpp5d (inositol polyphosphate-5-phosphatase D)
locus, spanning approximately 102 kb on chromosome 1 at position 57.0 cM of the
genetic map, or cytoband C5 of the cytogenetic map (reviewed in Rohrschneider et al.,
2000; Wolf et al., 2001; NCBI databases). The full-length protein is 145 kDa, but
splicing, involving exons 25 and 26, can produce 4 additional proteins ranging in size
from 109-135 kDa (Lucas and Rohrschneider, 1999; Wolf et al., 2000). These splicing
reactions affect the 350-amino acid C-terminal tail region and its numerous protein
interaction motifs, such as those binding PTB, SH2, and SH3 domains.

The prominent structural features of the SHIP1 protein dictate its major functional
interactions. The SHIP1 SH2 domain has general specificity for tyrosine-phosphorylated
Yxx(L/I/V) amino acid motifs, and its inositol 5'-phosphatase domain removes phosphate
from the 5' position of inositol(3,4,5)P3, phosphatidylinositol(3,4,5)P3 or
inositol(1,3,4,5)P4 [see Sly et al. (2003) for review]. The tyrosine-phosphorylated
C-terminal tail interacts directly with the PTB domain of Shc and Dok proteins (Lioubin et
al., 1996; Sattler et al., 2001; Tamir et al., 2000), and a potential interaction motif for the
SH2 domain of the p85 component of the p85/PI3K is present in the full-length SHIP1
(Gupta et al., 1999; Lucas and Rohrschneider, 1999), but eliminated by the splicing events
producing the α and β isoforms (Rohrschneider et al., 2000). Polyproline-rich interaction
motifs for the SH3 domains of Grb2 also are present in the C-tail region (Kavanaugh et
al., 1996). The SHIP1 proteins (e.g., the 145 kDa protein and isoforms thereof) are
expressed in hematopoietic cells and testes, with lower expression observed in a few
other adult tissues (Q. Liu et al., 1998, reviewed in Rohrschneider, 2003).

Functionally, both biochemical and genetic experiments indicate SHIP1 is a
negative regulator of myeloid cell proliferation, survival, and perhaps chemotaxis (see
Sly et al., 2003; Rohrschneider, 2003). Also, SHIP1 negatively regulates degranulation,
inflammatory cytokine release, and adhesion for mast cells, and SHIP1 is a component of
negative signaling (anergy) in B cell proliferation. The molecular mechanisms for most
of these effects require the attachment of the SHIP1 SH2 domain to the cytoplasmic
portions of transmembrane receptors containing appropriate tyrosine-phosphorylated
interaction motifs. There, the SHIP1 inositol 5'-phosphatase domain converts the plasma
membrane PI3K-produced substrate, phosphatidylinositol(3,4,5)P3, to
phosphatidylinositol(3,4)P2, effectively terminating proliferation signals. Therefore, the
SH2 domain of SHIP1 plays a critical role in initiating many of these negative biological
effects.

An additional smaller protein from the ship1 locus is described as an SH2-less
104 kDa protein (Tu et al., 2001). This product is called s-SHIP, with the prefix
signifying its only known expression within two stem cell types (i.e., ES cells and
lineage-depleted Sca1-positive cells of the bone marrow). This protein was first described
by Kavanaugh et al. (1996) and called SIP-110 in the human; but details of its existence
were unclear until Tu et al. (2001) defined the cDNA and demonstrated endogenous
expression in the two cell types described above. Thus, structurally, s-SHIP differs from
SHIP1 only by the lack of the N-terminal SH2 domain; but biochemically, s-SHIP also
lacks tyrosine phosphorylation and association with Shc (Kavanaugh et al., 1996; Tu et
al., 2001). Nevertheless, s-SHIP constitutively interacts with Grb2. The lack of an SH2
domain in s-SHIP indicates its interaction mechanism with target proteins probably
differs from that of SHIP1; however, the biological functions of s-SHIP are not known.

II. Nucleic Acids 

A. Polynucleotides 

The s-ship1 promoter was identified as a strong promoter for s-SHIP by analyses
of the genomic ship1 intron-5 region in driving GFP expression both in vitro and in vivo.
This promoter exhibited cell-type specific expression in ES cells, and mice transgenic for
the promoter (the 11.5kb-GFP transgene) showed tissue-specific GFP expression within
the inner cell mass of the blastocyst. Transgenic mice produced with a shorter promoter
construct (the 6.2kb-GFP transgene) expressed GFP throughout the blastocyst, suggesting
the absence of negative regulatory regions in the shortened transgene. RT-PCR analysis
demonstrated s-SHIP expression within the blastocyst. These results indicate that the
11.5-kb promoter region of the transgene contains the information for tissue-specific
expression of s-SHIP, as well as tissue-specific shut-off of this protein. It is specifically
contemplated that this promoter and the transgenic mice will be useful for future
examination of GFP-expression in potential stem/progenitor cells of the embryo and the
adult mouse.

The present invention concerns polynucleotides, isolatable from cells, that are free 
from total genomic DNA and that contain an s-ship promoter. It is contemplated that the 
s-ship1 promoter is capable of directing transcription of a nucleic acid sequence. 
Transcription may be directed in a tissue-specific or developmental manner. The nucleic 
acid sequence may encode a peptide or polypeptide, or it may encode an RNA 
molecule that is not translated into a protein. 

A "promoter" is a control sequence that is a region of a nucleic acid sequence at 
which initiation and rate of transcription are controlled. It may contain genetic elements 
at which regulatory proteins and molecules may bind, such as RNA polymerase and 
transcription factors. The phrases "operatively positioned," "operatively linked," "under 
control," "operatively attached," and "under transcriptional control" mean that a promoter 
is in a correct functional location and/or orientation in relation to a nucleic acid sequence 
to control transcriptional initiation and/or expression of that sequence. Typically, the 
promoter is located 5' or upstream from the strand of sequence to be transcribed. A 
promoter may or may not be used in conjunction with an "enhancer," which refers to a 
cis-acting regulatory nucleic acid sequence involved in the transcriptional activation of a 
nucleic acid sequence. 

As used herein, the term "DNA segment" or "nucleic acid segment" refers to a 
DNA or nucleic acid molecule that has been isolated free of total genomic DNA of a 
particular species. Therefore, a DNA segment encoding a polypeptide refers to a DNA 
segment that contains wild-type, polymorphic, or mutant polypeptide-coding sequences 
yet is isolated away from, or purified free from, total mammalian or human genomic 
DNA. Included within the term "DNA segment" are a polypeptide or polypeptides, DNA 
segments smaller than a polypeptide, and recombinant vectors, including, for example, 
plasmids, cosmids, phage, viruses, and the like. 

As used in this application, the term "s-ship polynucleotide" refers to an 
s-ship-encoding nucleic acid molecule. The term "cDNA" is intended to refer to DNA prepared 
using messenger RNA (mRNA) as template. 

It also is contemplated that a particular polypeptide from a given species may be 
represented by natural variants that have slightly different nucleic acid sequences but, 
nonetheless, encode the same protein. 

Similarly, a polynucleotide comprising an isolated or purified wild-type, 
polymorphic, or mutant polypeptide gene refers to a DNA segment including wild-type, 
polymorphic, or mutant polypeptide coding sequences isolated substantially away from 
other naturally occurring genes or protein encoding sequences. In this respect, the term 
"gene" is used for simplicity to refer to a functional protein, polypeptide, or 
peptide-encoding unit. As will be understood by those in the art, this functional term includes 
genomic sequences, cDNA sequences, and smaller engineered gene segments that 
express, or may be adapted to express, proteins, polypeptides, domains, peptides, fusion 
proteins, and mutants. A nucleic acid encoding all or part of a native or modified 
polypeptide may contain a contiguous nucleic acid sequence encoding all or a portion of 
such a polypeptide of the following lengths: about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 
110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 
290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 441, 450, 
460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 
640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 
820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 
1000, 1010, 1020, 1030, 1040, 1050, 1060, 1070, 1080, 1090, 1095, 1100, 1500, 2000, 
2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 9000, 10000, 
or more nucleotides, nucleosides, or base pairs. 

In particular embodiments, the invention concerns isolated DNA segments and 
recombinant vectors incorporating an s-ship promoter with a heterologous nucleic acid 
sequence or a ship or s-ship cDNA segment. Thus, an isolated DNA segment or vector 
containing a DNA segment may encode, for example, the heterologous nucleic acid 
sequence. The term "recombinant" may be used in conjunction with a polypeptide, the 
name of a specific polypeptide, a nucleic acid sequence, or a host cell, and this generally 
means that the entity involves or involved a nucleic acid molecule that was manipulated 
in vitro using recombinant DNA technology. 

The nucleic acid segments used in the present invention, regardless of the length 
of the coding sequence itself, may be combined with other nucleic acid sequences, such 
as enhancers, polyadenylation signals, additional restriction enzyme sites, multiple 
cloning sites, other coding segments, and the like, such that their overall length may vary 
considerably. It is therefore contemplated that a nucleic acid fragment of almost any 
length may be employed, with the total length preferably being limited by the ease of 
preparation and use in the intended recombinant DNA protocol. 

It is contemplated that the nucleic acid constructs of the present invention may 
encode a full-length polypeptide from any source or encode a truncated version of the 
polypeptide, such that the transcript of the coding region represents the truncated version. 
The truncated transcript may then be translated into a truncated protein. Alternatively, a 
nucleic acid sequence may encode a full-length polypeptide sequence with additional 
heterologous coding sequences, for example to allow for purification of the polypeptide, 
transport, secretion, post-translational modification, or for therapeutic benefits such as 
targeting or efficacy. As discussed above, a tag or other heterologous polypeptide may 
be added to the modified polypeptide-encoding sequence, wherein "heterologous" refers 
to a polypeptide that is not the same as the modified polypeptide. 

In a non-limiting example, one or more nucleic acid constructs may be prepared 
that include a contiguous stretch of nucleotides of sequences disclosed herein, including 
the s-ship promoter. 

A nucleic acid construct may be at least 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 
120, 130, 140, 150, 160, 170, 180, 190, 200, 250, 300, 400, 500, 600, 700, 800, 900, 
1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 15,000, 20,000, 
30,000, 50,000, 100,000, 250,000, 500,000, 750,000, to at least 1,000,000 nucleotides in 
length, as well as constructs of greater size, up to and including chromosomal sizes 
(including all intermediate lengths and intermediate ranges), given that nucleic 
acid constructs such as yeast artificial chromosomes are known to those of ordinary 
skill in the art. It will be readily understood that "intermediate lengths" and 
"intermediate ranges," as used herein, mean any length or range including or between 
the quoted values (i.e., all integers including and between such values). 

It is specifically contemplated that nucleic acids of the invention may include, be 
at most, or be at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 
73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 
97, 98, 99, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 
250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 
430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 
610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 
790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 
970, 980, 990, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 
2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 
3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, 4900, 
5000, 5100, 5200, 5300, 5400, 5500, 5600, 5700, 5800, 5900, 6000, 6100, 6200, 6300, 
6400, 6500, 6600, 6700, 6800, 6900, 7000, 7100, 7200, 7300, 7400, 7500, 7600, 7700, 
7800, 7900, 8000, 8100, 8200, 8300, 8400, 8500, 8600, 8700, 8800, 8900, 9000, 9100, 
9200, 9300, 9400, 9500, 9600, 9700, 9800, 9900, 10000, 10100, 10200, 10300, 10400, 
10500, 10600, 10700, 10800, 10900, 11000, 11100, 11200, 11300, 11400, 11500, 11600, 
11700, 11800, 11900, 12000 or more contiguous nucleotides (or any range derivable 
therein) of nucleic acid disclosed in this application, including, but not limited to, SEQ ID 
NO:1, intron 5 of the mouse s-ship gene, an s-ship promoter, and any other SEQ ID NOs. 

The various probes and primers designed around the nucleotide sequences of the 
present invention may be of any length. By assigning numeric values to a sequence, for 
example, the first residue is 1, the second residue is 2, etc., an algorithm defining all primers 
can be proposed: 

n to n + y 

where n is an integer from 1 to the last number of the sequence and y is the length of the 
primer minus one, where n + y does not exceed the last number of the sequence. Thus, for a 
10-mer, the probes correspond to bases 1 to 10, 2 to 11, 3 to 12 ... and so on. For a 15-mer, 
the probes correspond to bases 1 to 15, 2 to 16, 3 to 17 ... and so on. For a 20-mer, the 
probes correspond to bases 1 to 20, 2 to 21, 3 to 22 ... and so on. 
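
The "n to n + y" rule can be illustrated with a minimal sketch. The function names `primer_windows` and `primers` are hypothetical and are used here only for illustration; coordinates are 1-based and inclusive, as in the text.

```python
def primer_windows(sequence_length, primer_length):
    """Yield (n, n + y) for every primer window, 1-based and inclusive.

    y is the primer length minus one, and n + y never exceeds the
    last position of the sequence.
    """
    if primer_length < 1 or primer_length > sequence_length:
        return  # no window fits; the generator yields nothing
    y = primer_length - 1
    for n in range(1, sequence_length - y + 1):
        yield (n, n + y)


def primers(sequence, primer_length):
    """Return the subsequence covered by each window, in order."""
    return [
        sequence[n - 1:end]  # convert 1-based inclusive to a Python slice
        for n, end in primer_windows(len(sequence), primer_length)
    ]
```

For a sequence of 12 residues and a 10-mer, `list(primer_windows(12, 10))` gives the windows 1 to 10, 2 to 11, and 3 to 12, matching the enumeration above.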

It also will be understood that this invention is not limited to the particular nucleic 
acid sequences of SEQ ID NO:1. Recombinant vectors and isolated DNA segments may 
therefore variously include coding regions, coding regions bearing selected alterations or 
modifications in the basic coding region, or they may encode biologically functional 
equivalent sequences. For example, mutations can be made to SEQ ID NO:1 that 
potentially enhance or alter function relative to the native sequence or, alternatively, may 
be silent with regard to function. 

The s-ship promoter sequence of the invention is exemplified by the nucleic acid 
sequence given in SEQ ID NO:1. However, in addition to the unmodified s-ship 
promoter sequence of SEQ ID NO:1, the current invention includes derivatives of this 
sequence and compositions made therefrom. In particular, the present disclosure 
provides the teaching for one of skill in the art to make and use derivatives of the s-ship 
promoter. For example, the disclosure provides the teaching for one of skill in the art to 
delimit the functional elements within the s-ship promoter and to delete any non-essential 
elements. Functional elements also could be modified to increase the utility of the 
sequences of the invention for any particular application. For example, a functional 
region within the s-ship promoter of the invention could be modified to cause or increase 
tissue-specific expression. Such changes could be made by site-specific mutagenesis 
techniques, for example, as described below. 

One efficient means for preparing such derivatives comprises introducing 
mutations into the sequences of the invention, for example, the sequence given in SEQ ID 
NO:1. Such mutants may potentially have enhanced or altered function relative to the 
native sequence or, alternatively, may be silent with regard to function. 

Mutagenesis may be carried out at random and the mutagenized sequences 
screened for function in a trial-and-error procedure. Alternatively, particular sequences 
that provide the s-ship promoter with desirable expression characteristics could be 
identified and these or similar sequences introduced into other related or non-related 
sequences via mutation. Similarly, non-essential elements may be deleted without 
significantly altering the function of the elements. It further is contemplated that one 
could mutagenize these sequences in order to enhance their utility in expressing 
transgenes in a particular cell type, for example, in a particular stem cell. 

The means for mutagenizing a DNA segment containing an s-ship promoter 
sequence of the current invention are well-known to those of skill in the art. Mutagenesis 
may be performed in accordance with any of the techniques known in the art, such as, but 
not limited to, synthesizing an oligonucleotide having one or more mutations within the 
sequence of a particular regulatory region. In particular, site-specific mutagenesis is a 
technique useful in the preparation of promoter mutants, through specific mutagenesis of 
the underlying DNA. The technique further provides a ready ability to prepare and test 
sequence variants, for example, incorporating one or more of the foregoing 
considerations, by introducing one or more nucleotide sequence changes into the DNA. 
Site-specific mutagenesis allows the production of mutants through the use of specific 
oligonucleotide sequences which encode the DNA sequence of the desired mutation, as 
well as a sufficient number of adjacent nucleotides, to provide a primer sequence of 
sufficient size and sequence complexity to form a stable duplex on both sides of the 
deletion junction being traversed. Typically, a primer of about 17 to about 75 nucleotides 
or more in length is preferred, with about 10 to about 25 or more residues on both sides 
of the junction of the sequence being altered. 
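
As a minimal sketch of the primer geometry just described, the following builds a single-substitution primer with a fixed number of unaltered residues on each side of the altered position. The function name `mutagenic_primer`, its parameters, and the default flank of 10 are assumptions for illustration only; the sketch returns the primer in the same orientation as the template and does not address strand choice, melting temperature, or multi-base changes.

```python
def mutagenic_primer(template, position, new_base, flank=10):
    """Return a primer carrying one substitution at 1-based `position`.

    `flank` unaltered residues are kept on each side of the change, so
    the primer length is 2 * flank + 1 (21 nucleotides for flank=10).
    """
    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be one of A, C, G, T")
    index = position - 1  # convert to 0-based indexing
    if index - flank < 0 or index + flank >= len(template):
        raise ValueError("not enough flanking sequence around the position")
    left = template[index - flank:index]
    right = template[index + 1:index + flank + 1]
    return left + new_base + right
```

With `flank=10` the primer is 21 nucleotides long, which falls within the range of about 17 to about 75 nucleotides given above; a larger flank (up to about 25) lengthens the primer accordingly.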

In general, the technique of site-specific mutagenesis is well known in the art, as 
exemplified by various publications. As will be appreciated, the technique typically 
employs a phage vector which exists in both a single-stranded and double-stranded form. 
Typical vectors useful in site-directed mutagenesis include vectors such as the M13 
phage. These phage are readily commercially available and their use is generally well 
known to those skilled in the art. Double-stranded plasmids also are routinely employed 
in site-directed mutagenesis, which eliminates the step of transferring the gene of interest 
from a plasmid to a phage. 

Site-directed mutagenesis in accordance herewith typically is performed by first 
obtaining a single-stranded vector or melting apart the two strands of a double-stranded 
vector which includes within its sequence a DNA sequence that includes the s-ship 
promoter. An oligonucleotide primer bearing the desired mutated sequence is prepared, 
generally synthetically. This primer is then annealed with the single-stranded vector, and 
subjected to DNA polymerizing enzymes such as the E. coli polymerase I Klenow 
fragment, in order to complete the synthesis of the mutation-bearing strand. Thus, a 
heteroduplex is formed wherein one strand encodes the original non-mutated sequence 
and the second strand bears the desired mutation. This heteroduplex vector is then used 
to transform or transfect appropriate cells, such as E. coli cells, and cells are selected 
which include recombinant vectors bearing the mutated sequence arrangement. Vector 
DNA can then be isolated from these cells and used for transformation of the intended host cells. A genetic 
selection scheme was devised by Kunkel et al. (1987) to enrich for clones incorporating 
mutagenic oligonucleotides. Alternatively, PCR™ with commercially 
available thermostable enzymes such as Taq polymerase may be used to incorporate a 
mutagenic oligonucleotide primer into an amplified DNA fragment that can then be 
cloned into an appropriate cloning or expression vector. The PCR™-mediated 
mutagenesis procedures of Tomic et al. (1990) and Upender et al. (1995) provide two 
examples of such protocols. A PCR™ employing a thermostable ligase in addition to a 
thermostable polymerase also may be used to incorporate a phosphorylated mutagenic 
oligonucleotide into an amplified DNA fragment that may then be cloned into an 
appropriate cloning or expression vector. 

The preparation of sequence variants of the selected promoter DNA segments 
using site-directed mutagenesis is provided as a means of producing potentially useful 
promoter sequences and is not meant to be limiting, as there are other ways in which 
sequence variants of DNA sequences may be obtained. For example, recombinant 
vectors encoding the desired promoter sequence may be treated with mutagenic agents, 
such as hydroxylamine, to obtain sequence variants. 

As used herein, the term "oligonucleotide-directed mutagenesis procedure" refers 
to template-dependent processes and vector-mediated propagation which result in an 
increase in the concentration of a specific nucleic acid molecule relative to its initial 
concentration, or in an increase in the concentration of a detectable signal, such as 
amplification. As used herein, the term "oligonucleotide-directed mutagenesis 
procedure" also is intended to refer to a process that involves the template-dependent 
extension of a primer molecule. The term "template-dependent process" refers to nucleic 
acid synthesis of an RNA or a DNA molecule wherein the sequence of the newly 
synthesized strand of nucleic acid is dictated by the well-known rules of complementary 
base pairing (see, for example, Watson and Ramstad, 1987). Typically, vector-mediated 
methodologies involve the introduction of the nucleic acid fragment into a DNA or RNA 
vector, the clonal amplification of the vector, and the recovery of the amplified nucleic 
acid fragment. Examples of such methodologies are provided by U.S. Patent No. 
4,237,224, specifically incorporated herein by reference in its entirety. A number of 
template-dependent processes are available to amplify the target sequences of interest 
present in a sample, such methods being well known in the art and specifically disclosed 
herein below. 

One efficient, targeted means for preparing mutagenized promoters or enhancers 
relies upon the identification of putative regulatory elements within the target sequence. 
This can be initiated by comparison with, for example, promoter sequences known to be 
expressed in a similar manner. Sequences which are shared among elements with similar 
functions or expression patterns are likely candidates for the binding of transcription 
factors and are thus likely elements which confer expression patterns. Confirmation of 
these putative regulatory elements can be achieved by deletion analysis of each putative 
regulatory region followed by functional analysis of each deletion construct by assay of a 
reporter gene which is functionally attached to each construct. As such, once a starting 
promoter or intron sequence is provided, any of a number of different functional deletion 
mutants of the starting sequence could be readily prepared. 
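
One simple way to look for sequences shared among promoters with similar expression patterns is to intersect their fixed-length subsequences. The sketch below is illustrative only: the function name `shared_kmers` and the choice of exact k-mer matching are assumptions, and real motif discovery would normally tolerate mismatches and consider both strands.

```python
def shared_kmers(sequences, k):
    """Return the set of length-k subsequences present in every sequence."""
    if not sequences:
        return set()

    def kmers(seq):
        # All contiguous windows of length k; empty if seq is shorter than k.
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    common = kmers(sequences[0])
    for seq in sequences[1:]:
        common &= kmers(seq)  # keep only windows seen in every sequence so far
    return common
```

Any subsequence returned is only a candidate regulatory element; as stated above, confirmation would still rely on deletion analysis and a reporter assay.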

As indicated above, deletion mutants of the s-ship promoter also could be 
randomly prepared and then assayed. With this strategy, a series of constructs are 
prepared, each containing a different portion of the clone (a subclone), and these 
constructs are then screened for activity. A suitable means for screening for activity is to 
attach a deleted promoter construct to a selectable or screenable marker, and to isolate 
only those cells expressing the marker protein. In this way, a number of different, deleted 
promoter constructs are identified which still retain the desired, or even enhanced, 
activity. The smallest segment which is required for activity is thereby identified through 
comparison of the selected constructs. This segment may then be used for the 
construction of vectors for the expression of exogenous protein. 
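
The comparison of selected constructs can be sketched as follows, assuming each deletion construct is recorded as a hypothetical `(start, end, active)` tuple, where the coordinates are 1-based positions on the starting clone and `active` records whether marker expression was observed. The function name and the example coordinates are illustrative only and are not data from this disclosure.

```python
def smallest_active_segment(constructs):
    """Return (start, end) of the shortest construct that retained activity.

    `constructs` is an iterable of (start, end, active) tuples with
    1-based inclusive coordinates. Returns None if nothing was active.
    """
    active = [(start, end) for start, end, is_active in constructs if is_active]
    if not active:
        return None
    # Length of an inclusive interval is end - start + 1.
    return min(active, key=lambda seg: seg[1] - seg[0] + 1)


# Hypothetical screen results for four deletion constructs.
example = [
    (1, 1000, True),
    (201, 1000, True),
    (601, 1000, False),
    (201, 600, True),
]
```

Here `smallest_active_segment(example)` selects the 201 to 600 construct, the shortest one that still scored as active in this made-up screen.
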

1. Vectors 

Promoter sequences of the invention may be comprised in a vector. The term 
"vector" is used to refer to a carrier nucleic acid molecule into which a nucleic acid 
sequence can be inserted for introduction into a cell where it can be replicated. A nucleic 
acid sequence can be "exogenous," which means that it is foreign to the cell into which 
the vector is being introduced or that the sequence is homologous to a sequence in the 
cell but in a position within the host cell nucleic acid in which the sequence is ordinarily 
not found. Vectors include plasmids, cosmids, viruses (bacteriophage, animal viruses, 
and plant viruses), and artificial chromosomes (e.g., YACs). One of skill in the art would 
be well equipped to construct a vector through standard recombinant techniques, which 
are described in Sambrook et al. (1989) and Ausubel et al. (1996), both incorporated 
herein by reference. In addition to encoding a polypeptide, a vector may encode other 
polypeptide sequences such as a tag or targeting molecule. Useful vectors encoding such 
fusion proteins include pIN vectors (Inouye et al., 1985), vectors encoding a stretch of 
histidines, and pGEX vectors, for use in generating glutathione S-transferase (GST) soluble 
fusion proteins for later purification and separation or cleavage. A targeting molecule is 
one that directs the modified polypeptide to a particular organ, tissue, cell, or other location 
in a subject's body. 

The term "expression vector" refers to a vector containing a nucleic acid sequence 
coding for at least part of a gene product capable of being transcribed. In some cases, 
RNA molecules are then translated into a protein, polypeptide, or peptide. In other cases, 
these sequences are not translated, for example, in the production of antisense molecules, 
siRNA molecules, or miRNA molecules. In addition to s-ship promoter regions, 
expression vectors can contain a variety of other "control sequences," which refer to 
nucleic acid sequences necessary for the transcription and possibly translation of an 
operably linked coding sequence in a particular host organism. In addition to control 
sequences that govern transcription and translation, vectors and expression vectors may 
contain nucleic acid sequences that serve other functions as well and are described infra. 

In certain embodiments of the invention, the expression vector comprises a virus 
or engineered vector derived from a viral genome. The ability of certain viruses to enter 
cells via receptor-mediated endocytosis, to integrate into the host cell genome, and to express 
viral genes stably and efficiently has made them attractive candidates for the transfer of 
foreign genes into mammalian cells (Ridgeway, 1988; Nicolas and Rubenstein, 1988; 
Baichwal and Sugden, 1986; Temin, 1986). The first viruses used as gene vectors were 
DNA viruses including the papovaviruses (simian virus 40, bovine papilloma virus, and 
polyoma) (Ridgeway, 1988; Baichwal and Sugden, 1986) and adenoviruses (Ridgeway, 
1988; Baichwal and Sugden, 1986). These have a relatively low capacity for foreign 
DNA sequences and have a restricted host spectrum. Furthermore, their oncogenic 
potential and cytopathic effects in permissive cells raise safety concerns. They can 
accommodate only up to 8 kb of foreign genetic material but can be readily introduced in 
a variety of cell lines and laboratory animals (Nicolas and Rubenstein, 1988; Temin, 
1986). 

The retroviruses are a group of single-stranded RNA viruses characterized by an 
ability to convert their RNA to double-stranded DNA in infected cells; they can also be 
used as vectors. Other viral vectors may be employed as expression constructs in the 
present invention. Vectors derived from viruses such as vaccinia virus (Ridgeway, 1988; 
Baichwal and Sugden, 1986; Coupar et al., 1988), adeno-associated virus (AAV) 
(Ridgeway, 1988; Baichwal and Sugden, 1986; Hermonat and Muzycska, 1984), and 
herpesviruses may be employed. They offer several attractive features for various 
mammalian cells (Friedmann, 1989; Ridgeway, 1988; Baichwal and Sugden, 1986; 
Coupar et al., 1988; Horwich et al., 1990). 

a. Promoters and Enhancers 

A promoter may be one naturally associated with a gene or sequence, as may be 
obtained by isolating the 5' non-coding sequences located upstream of the coding 
segment and/or exon. Such a promoter can be referred to as "endogenous." Similarly, an 
enhancer may be one naturally associated with a nucleic acid sequence, located either 
downstream or upstream of that sequence. Alternatively, certain advantages will be 
gained by positioning the coding nucleic acid segment under the control of a recombinant 
or heterologous promoter, which refers to a promoter that is not normally associated with 
a nucleic acid sequence in its natural environment. A recombinant or heterologous 
enhancer refers also to an enhancer not normally associated with a nucleic acid sequence 
in its natural environment. Such promoters or enhancers may include promoters or 
enhancers of other genes, and promoters or enhancers isolated from any other 
prokaryotic, viral, or eukaryotic cell, and promoters or enhancers not "naturally 
occurring," i.e., containing different elements of different transcriptional regulatory 
regions, and/or mutations that alter expression. In addition to producing nucleic acid 
sequences of promoters and enhancers synthetically, sequences may be produced using 
recombinant cloning and/or nucleic acid amplification technology, including PCR™, in 
connection with the compositions disclosed herein (see U.S. Patent 4,683,202 and U.S. 
Patent 5,928,906, each incorporated herein by reference). Furthermore, it is 
contemplated that the control sequences that direct transcription and/or expression of 
sequences within non-nuclear organelles such as mitochondria, chloroplasts, and the like, 
can be employed as well. 

Naturally, it may be important to employ a promoter and/or enhancer that 
effectively directs the expression of the DNA segment in the cell type, organelle, and 
organism chosen for expression. Those of skill in the art of molecular biology generally 
know the use of promoters, enhancers, and cell type combinations for protein expression; 
for example, see Sambrook et al. (1989), incorporated herein by reference. The 
promoters employed may be constitutive, tissue-specific, inducible, and/or useful under 
the appropriate conditions to direct high level expression of the introduced DNA 
segment, such as is advantageous in the large-scale production of recombinant proteins 
and/or peptides. The promoter may be heterologous or endogenous. 

In addition to the s-ship promoter, other elements/promoters may be employed, in 
the context of the present invention, to regulate the expression of a gene. Table 1 is a list 
of other promoters and enhancers that may be used in conjunction with the s-ship 
promoter of the invention; this list also identifies references that indicate how promoters 
can be evaluated. It is not intended to be exhaustive of all the possible elements involved 
in the promotion of expression but, merely, to be exemplary thereof. Table 2 provides 
examples of inducible elements, which are regions of a nucleic acid sequence that can be 
activated in response to a specific stimulus. 

TABLE 1 

Promoter and/or Enhancer 

Promoter/Enhancer | References 
Immunoglobulin Heavy Chain | Banerji et al., 1983; Gilles et al., 1983; Grosschedl et al., 1985; Atchinson et al., 1986, 1987; Imler et al., 1987; Weinberger et al., 1984; Kiledjian et al., 1988; Porton et al., 1990 
Immunoglobulin Light Chain | Queen et al., 1983; Picard et al., 1984 
T-Cell Receptor | Luria et al., 1987; Winoto et al., 1989; Redondo et al., 1990 
HLA DQ α and/or DQ β | Sullivan et al., 1987 
β-Interferon | Goodbourn et al., 1986; Fujita et al., 1987; Goodbourn et al., 1988 
Interleukin-2 | Greene et al., 1989 
Interleukin-2 Receptor | Greene et al., 1989; Lin et al., 1990 
MHC Class II 5 | Koch et al., 1989 
MHC Class II HLA-DRα | Sherman et al., 1989 
β-Actin | Kawamoto et al., 1988; Ng et al., 1989 
Muscle Creatine Kinase (MCK) | Jaynes et al., 1988; Horlick et al., 1989; Johnson et al., 1989 
Prealbumin (Transthyretin) | Costa et al., 1988 
Elastase I | Ornitz et al., 1987 
Metallothionein (MTII) | Karin et al., 1987; Culotta et al., 1989 
Collagenase | Pinkert et al., 1987; Angel et al., 1987 
Albumin | Pinkert et al., 1987; Tronche et al., 1989, 1990 
α-Fetoprotein | Godbout et al., 1988; Campere et al., 1989 
t-Globin | Bodine et al., 1987; Perez-Stable et al., 1990 
β-Globin | Trudel et al., 1987 
c-fos | Cohen et al., 1987 
c-HA-ras | Triesman, 1986; Deschamps et al., 1985 
Insulin | Edlund et al., 1985 
Neural Cell Adhesion Molecule (NCAM) | Hirsh et al., 1990 
α1-Antitrypsin | Latimer et al., 1990 
H2B (TH2B) Histone | Hwang et al., 1990 
Mouse and/or Type I Collagen | Ripe et al., 1989 
Glucose-Regulated Proteins (GRP94 and GRP78) | Chang et al., 1989 
Rat Growth Hormone | Larsen et al., 1986 
Human Serum Amyloid A (SAA) | Edbrooke et al., 1989 
Troponin I (TN I) | Yutzey et al., 1989 
Platelet-Derived Growth Factor (PDGF) | Pech et al., 1989 
Duchenne Muscular Dystrophy | Klamut et al., 1990 
SV40 | Banerji et al., 1981; Moreau et al., 1981; Sleigh et al., 1985; Firak et al., 1986; Herr et al., 1986; Imbra et al., 1986; Kadesch et al., 1986; Wang et al., 1986; Ondek et al., 1987; Kuhl et al., 1987; Schaffner et al., 1988 
Polyoma | Swartzendruber et al., 1975; Vasseur et al., 1980; Katinka et al., 1980, 1981; Tyndell et al., 1981; Dandolo et al., 1983; de Villiers et al., 1984; Hen et al., 1986; Satake et al., 1988; Campbell et al., 1988 
Retroviruses | Kriegler et al., 1982, 1983; Levinson et al., 1982; Kriegler et al., 1983, 1984a, b, 1988; Bosze et al., 1986; Miksicek et al., 1986; Celander et al., 1987; Thiesen et al., 1988; Celander et al., 1988; Choi et al., 1988; Reisman et al., 1989 
Papilloma Virus | Campo et al., 1983; Lusky et al., 1983; Spandidos and Wilkie, 1983; Spalholz et al., 1985; Lusky et al., 1986; Cripe et al., 1987; Gloss et al., 1987; Hirochika et al., 1987; Stephens et al., 1987 
Hepatitis B Virus | Bulla et al., 1986; Jameel et al., 1986; Shaul et al., 1987; Spandau et al., 1988; Vannice et al., 1988 
Human Immunodeficiency Virus | Muesing et al., 1987; Hauber et al., 1988; Jakobovits et al., 1988; Feng et al., 1988; Takebe et al., 1988; Rosen et al., 1988; Berkhout et al., 1989; Laspia et al., 1989; Sharp et al., 1989; Braddock et al., 1989 
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TABLE 1 

Promoter and/or Enhancer 


Promoter/Enhancer 


References 


Cytomegalovirus (CMV) 


Weber et al, 1984; Boshart et al, 1985; Foecking 
etal, 1986 


Gibbon Ape Leukemia Virus 


Holbrook et al, 1987; Quinn et al, 1989 



TABLE 2

Inducible Elements

Element                     Inducer                     References

MTII                        Phorbol Ester (TPA),        Palmiter et al., 1982; Haslinger et al., 1985;
                            Heavy metals                Searle et al., 1985; Stuart et al., 1985;
                                                        Imagawa et al., 1987; Karin et al., 1987;
                                                        Angel et al., 1987b; McNeall et al., 1989
MMTV (mouse mammary         Glucocorticoids             Huang et al., 1981; Lee et al., 1981;
tumor virus)                                            Majors et al., 1983; Chandler et al., 1983;
                                                        Lee et al., 1984; Ponta et al., 1985;
                                                        Sakai et al., 1988
β-Interferon                poly(rI)x, poly(rc)         Tavernier et al., 1983
Adenovirus 5 E2             E1A                         Imperiale et al., 1984
Collagenase                 Phorbol Ester (TPA)         Angel et al., 1987a
Stromelysin                 Phorbol Ester (TPA)         Angel et al., 1987b
SV40                        Phorbol Ester (TPA)         Angel et al., 1987b
Murine MX Gene              Interferon, Newcastle       Hug et al., 1988
                            Disease Virus
GRP78 Gene                  A23187                      Resendez et al., 1988
α-2-Macroglobulin           IL-6                        Kunz et al., 1989
Vimentin                    Serum                       Rittling et al., 1989
MHC Class I Gene H-2Kb      Interferon                  Blanar et al., 1989
HSP70                       E1A, SV40 Large T Antigen   Taylor et al., 1989, 1990a, 1990b
Proliferin                  Phorbol Ester-TPA           Mordacq et al., 1989
Tumor Necrosis Factor       PMA                         Hensel et al., 1989
Thyroid Stimulating         Thyroid Hormone             Chatterjee et al., 1989
Hormone α Gene



The identity of tissue-specific promoters or elements, as well as assays to
characterize their activity, is well known to those of skill in the art. Examples of such
regions include the human LIMK2 gene (Nomoto et al., 1999), the somatostatin receptor
2 gene (Kraus et al., 1998), murine epididymal retinoic acid-binding gene (Lareyre et al.,
1999), human CD4 (Zhao-Emonet et al., 1998), mouse alpha2 (XI) collagen (Tsumaki et
al., 1998), D1A dopamine receptor gene (Lee et al., 1997), insulin-like growth factor II
(Wu et al., 1997), human platelet endothelial cell adhesion molecule-1 (Almendro et al.,
1996), and the SM22α promoter.

b. Initiation Signals and Internal Ribosome Binding Sites

A specific initiation signal also may be required for efficient translation of coding 
sequences. These signals include the ATG initiation codon or adjacent sequences. 
Exogenous translational control signals, including the ATG initiation codon, may need to 
be provided. One of ordinary skill in the art would readily be capable of determining this
and providing the necessary signals. It is well known that the initiation codon must be
"in-frame" with the reading frame of the desired coding sequence to ensure translation of 
the entire insert. The exogenous translational control signals and initiation codons can be 
either natural or synthetic. The efficiency of expression may be enhanced by the 
inclusion of appropriate transcription enhancer elements. 
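
By way of illustration only, the in-frame requirement described above can be checked computationally. The following is a minimal sketch, not part of the disclosed methods: the function name, arguments, and example sequence are hypothetical. It treats an upstream ATG as in-frame when its distance to the first base of the coding sequence is a multiple of three and no stop codon lies between them.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def atg_in_frame(seq, atg_pos, cds_start):
    """Return True if the ATG at atg_pos reads in frame into cds_start.

    seq       -- DNA sequence of the construct (coding strand, 5' to 3')
    atg_pos   -- 0-based index of the A of the candidate initiation codon
    cds_start -- 0-based index of the first base of the desired coding sequence
    """
    seq = seq.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    # The ATG must lie at or upstream of the coding sequence, a whole number
    # of codons away from it.
    if atg_pos > cds_start or (cds_start - atg_pos) % 3 != 0:
        return False
    # An intervening stop codon would terminate translation before the insert.
    for i in range(atg_pos, cds_start, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


example = "CCATGGCTAAAGGTTGA"  # hypothetical construct; ATG begins at index 2
in_frame = atg_in_frame(example, 2, 8)
out_of_frame = atg_in_frame(example, 2, 9)
```

In this hypothetical example the coding sequence beginning at index 8 is two codons downstream of the ATG and is therefore in frame, whereas a coding sequence beginning at index 9 is not.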

In certain embodiments of the invention, internal ribosome entry site
(IRES) elements are used to create multigene, or polycistronic, messages. IRES elements
are able to bypass the ribosome scanning model of 5'-methylated cap-dependent
translation and begin translation at internal sites (Pelletier and Sonenberg, 1988). IRES
elements from two members of the picornavirus family (polio and encephalomyocarditis)
have been described (Pelletier and Sonenberg, 1988), as well as an IRES from a mammalian
message (Macejak and Sarnow, 1991). IRES elements can be linked to heterologous
open reading frames. Multiple open reading frames can be transcribed together, each
separated by an IRES, creating polycistronic messages. By virtue of the IRES element,
each open reading frame is accessible to ribosomes for efficient translation. Multiple
genes can be efficiently expressed using a single promoter/enhancer to transcribe a single
message (see U.S. Patents 5,925,565 and 5,935,819, herein incorporated by reference).

c. Multiple Cloning Sites

Vectors can include a multiple cloning site (MCS), which is a nucleic acid region
that contains multiple restriction enzyme sites, any of which can be used in conjunction
with standard recombinant technology to digest the vector. (See Carbonelli et al., 1999,
Levenson et al., 1998, and Cocea, 1997, incorporated herein by reference.) "Restriction
enzyme digestion" refers to catalytic cleavage of a nucleic acid molecule with an enzyme
that functions only at specific locations in a nucleic acid molecule. Many of these
restriction enzymes are commercially available. Use of such enzymes is widely
understood by those of skill in the art. Frequently, a vector is linearized or fragmented
using a restriction enzyme that cuts within the MCS to enable exogenous sequences to be
ligated to the vector. "Ligation" refers to the process of forming phosphodiester bonds
between two nucleic acid fragments, which may or may not be contiguous with each
other. Techniques involving restriction enzymes and ligation reactions are well known to
those of skill in the art of recombinant technology.
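
As a non-limiting sketch of the site-specific cleavage described above, the following code cuts a linear sequence at every occurrence of a recognition site. The function name and the example sequence are hypothetical. The enzyme used for illustration, EcoRI (recognition site GAATTC, cleaved after the first G on the top strand), is an assumption chosen for the example and is not named in this specification. Only the top strand is modeled.

```python
def digest(seq, site, cut_offset):
    """Cut a linear sequence at each occurrence of a recognition site.

    seq        -- top-strand DNA sequence, 5' to 3'
    site       -- recognition sequence, e.g. "GAATTC"
    cut_offset -- number of bases of the site that remain on the upstream fragment
    Returns the list of fragments in 5' to 3' order.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder downstream of the last cut
    return fragments


# Hypothetical sequence containing two EcoRI sites.
fragments = digest("AAGAATTCTTGAATTCGG", "GAATTC", 1)
```

Joining the fragments in order regenerates the starting sequence, and a sequence lacking the site is returned as a single uncut fragment.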

d. Splicing Sites 

Most transcribed eukaryotic RNA molecules will undergo RNA splicing to
remove introns from the primary transcripts. Vectors containing genomic eukaryotic
sequences may require donor and/or acceptor splicing sites to ensure proper processing of
the transcript for protein expression. (See Chandler et al., 1997, incorporated herein by
reference.)

e. Termination Signals

The vectors or constructs of the present invention will generally comprise at least 
one termination signal. A "termination signal" or "terminator" is comprised of the DNA 
sequences involved in specific termination of an RNA transcript by an RNA polymerase. 
Thus, in certain embodiments a termination signal that ends the production of an RNA
transcript is contemplated. A terminator may be necessary in vivo to achieve desirable 
message levels. 

In eukaryotic systems, the terminator region may also comprise specific DNA
sequences that permit site-specific cleavage of the new transcript so as to expose a
polyadenylation site. This signals a specialized endogenous polymerase to add a stretch
of about 200 A residues (polyA) to the 3' end of the transcript. RNA molecules modified
with this polyA tail appear to be more stable and are translated more efficiently. Thus, in
other embodiments involving eukaryotes, it is preferred that the terminator comprises a
signal for the cleavage of the RNA, and it is more preferred that the terminator signal
promotes polyadenylation of the message. The terminator and/or polyadenylation site
elements can serve to enhance message levels and/or to minimize read through from the
cassette into other sequences.

Terminators contemplated for use in the invention include any known terminator
of transcription described herein or known to one of ordinary skill in the art, including
but not limited to, for example, the termination sequences of genes, such as the
bovine growth hormone terminator, or viral termination sequences, such as
the SV40 terminator. In certain embodiments, the termination signal may be a
lack of transcribable or translatable sequence, such as due to a sequence truncation.

f. Polyadenylation Signals 

In expression, particularly eukaryotic expression, one will typically include a
polyadenylation signal to effect proper polyadenylation of the transcript. The nature of
the polyadenylation signal is not believed to be crucial to the successful practice of the
invention, and any such sequence may be employed. Preferred embodiments include
the SV40 polyadenylation signal and/or the bovine growth hormone polyadenylation
signal, which are convenient and/or known to function well in various target cells.
Polyadenylation may increase the stability of the transcript or may facilitate cytoplasmic
transport.

g. Origins of Replication

In order to propagate a vector in a host cell, the vector may contain one or more origins of
replication (often termed "ori"), each of which is a specific nucleic acid sequence at which
replication is initiated. Alternatively, an autonomously replicating sequence (ARS) can be
employed if the host cell is yeast.

h. Selectable and Screenable Markers 

In certain embodiments of the invention, cells containing a nucleic acid construct 
of the present invention may be identified in vitro or in vivo by including a marker in the 
expression vector. Such markers would confer an identifiable change to the cell
permitting easy identification of cells containing the expression vector. Generally, a
selectable marker is one that confers a property that allows for selection. A positive 
selectable marker is one in which the presence of the marker allows for its selection, 
while a negative selectable marker is one in which its presence prevents its selection. An 
example of a positive selectable marker is a drug resistance marker. 

Usually the inclusion of a drug selection marker aids in the cloning and
identification of transformants. For example, genes that confer resistance to neomycin,
puromycin, hygromycin, DHFR, GPT, zeocin and histidinol are useful selectable
markers.

In addition to markers conferring a phenotype that allows for the discrimination of
transformants based on the implementation of conditions, other types of markers
including screenable markers such as GFP, whose basis is colorimetric analysis, are also
contemplated. Alternatively, screenable enzymes such as herpes simplex virus thymidine
kinase (tk), chloramphenicol acetyltransferase (CAT), or luciferase may be utilized. One
of skill in the art would also know how to employ immunologic markers, possibly in
conjunction with FACS analysis. The marker used is not believed to be important, so
long as it is capable of being expressed simultaneously with the nucleic acid encoding a
gene product. Further examples of selectable and screenable markers are well known to
one of skill in the art.






B. Host Cells 

As used herein, the terms "cell," "cell line," and "cell culture" may be used
interchangeably. All of these terms also include their progeny, which is any and all
subsequent generations. It is understood that all progeny may not be identical due to
deliberate or inadvertent mutations. In the context of expressing a heterologous nucleic
acid sequence, "host cell" refers to a prokaryotic or eukaryotic cell, and it includes any
transformable organism that is capable of replicating a vector and/or expressing a
heterologous gene encoded by a vector. A host cell can be, and has been, used as a recipient
for vectors. A host cell may be "transfected" or "transformed," which refers to a process
by which exogenous nucleic acid, such as a modified protein-encoding sequence, is
transferred or introduced into the host cell. A transformed cell includes the primary
subject cell and its progeny.

Host cells may be derived from prokaryotes or eukaryotes, including yeast cells,
insect cells, and mammalian cells, depending upon whether the desired result is
replication of the vector or expression of part or all of the vector-encoded nucleic acid
sequences. Numerous cell lines and cultures are available for use as a host cell, and they
can be obtained through the American Type Culture Collection (ATCC), which is an
organization that serves as an archive for living cultures and genetic materials
(www.atcc.org). An appropriate host can be determined by one of skill in the art based
on the vector backbone and the desired result. A plasmid or cosmid, for example, can be
introduced into a prokaryote host cell for replication of many vectors. Bacterial cells
used as host cells for vector replication and/or expression include DH5α, JM109, and
KC8, as well as a number of commercially available bacterial hosts such as SURE®
Competent Cells and Solopack™ Gold Cells (Stratagene®, La Jolla, CA).
Alternatively, bacterial cells such as E. coli LE392 could be used as host cells for phage
viruses. Appropriate yeast cells include Saccharomyces cerevisiae, Schizosaccharomyces
pombe, and Pichia pastoris.

Examples of eukaryotic host cells for replication and/or expression of a vector
include HeLa, NIH3T3, Jurkat, 293, Cos, CHO, Saos, and PC12. Stem cell lines and
other immature cell lines are specifically contemplated as suitable host cells of the
invention. Many host cells from various cell types and organisms are available and would
be known to one of skill in the art. Similarly, a viral vector may be used in conjunction
with either a eukaryotic or prokaryotic host cell, particularly one that is permissive for
replication or expression of the vector.

Some vectors may employ control sequences that allow them to be replicated and/or
expressed in both prokaryotic and eukaryotic cells. One of skill in the art would further
understand the conditions under which to incubate all of the above described host cells to
maintain them and to permit replication of a vector. Also understood and known are
techniques and conditions that would allow large-scale production of vectors, as well as
production of the nucleic acids encoded by vectors and their cognate polypeptides,
proteins, or peptides.

C. Assays of Transgene Expression 

Assays may be employed with the instant invention for determination of the
relative efficiency of transgene expression. For example, assays may be used to
determine the efficacy of deletion mutants of the s-ship promoter in directing expression
of exogenous proteins. Similarly, one could produce random or site-specific mutants of
the s-ship promoter of the invention and assay the efficacy of the mutants in the
expression of a given transgene. Alternatively, assays could be used to determine the
efficacy of the s-ship promoter in directing protein expression when used in conjunction
with various different enhancers, terminators or other types of elements potentially used
in the preparation of transformation constructs.

For mammals, expression assays may comprise a system utilizing cell lines, or 
alternatively, whole organisms. Additionally, assays of tissue or developmental specific 
promoters are generally feasible. 

The biological sample to be assayed may comprise nucleic acids isolated from the
cells of any plant material according to standard methodologies (Sambrook et al., 1989).
The nucleic acid may be genomic DNA or fractionated or whole cell RNA. Where RNA
is used, it may be desired to convert the RNA to a complementary DNA. In one
embodiment of the invention, the RNA is whole cell RNA; in another, it is poly-A RNA.
Normally, the nucleic acid is amplified.

Depending on the format, the specific nucleic acid of interest is identified in the
sample directly using amplification or with a second, known nucleic acid following
amplification. Next, the identified product is detected. In certain applications, the
detection may be performed by visual means (e.g., ethidium bromide staining of a gel).
Alternatively, the detection may involve indirect identification of the product via
chemiluminescence, radioactive scintigraphy of radiolabel or fluorescent label or even
via a system using electrical or thermal impulse signals (Affymax Technology; Bellus,
1994).

Following detection, one may compare the results seen in a given sample with a 
statistically significant reference group of non-transformed control cells. Typically, the 
non-transformed control cells will be of a genetic background similar to the transformed 
cells. In this way, it is possible to detect differences in the amount or kind of protein
detected in various transformed cells. 

As indicated, a variety of different assays are contemplated in the screening of
cells or animals of the current invention and associated promoters. These techniques may
in some cases be used to detect both the presence and expression of the particular genes as
well as rearrangements that may have occurred in the gene construct. The techniques
include, but are not limited to, fluorescent in situ hybridization (FISH), direct DNA
sequencing, pulsed field gel electrophoresis (PFGE) analysis, Southern or Northern
blotting, single-stranded conformation analysis (SSCA), RNAse protection assay,
allele-specific oligonucleotide (ASO), dot blot analysis, denaturing gradient gel
electrophoresis, RFLP and PCR™-SSCP.

1. Quantitation of Gene Expression with Relative Quantitative
RT-PCR™

Reverse transcription (RT) of RNA to cDNA followed by relative quantitative
PCR™ (RT-PCR™) can be used to determine the relative concentrations of specific
mRNA species, for example, an mRNA whose expression is controlled by an s-ship
promoter. By determining that the concentration of a specific mRNA species varies, it
can be shown that the gene encoding the specific mRNA species is differentially
expressed. In this way, a promoter's expression profile can be rapidly identified, as can
the efficacy with which the promoter directs transgene expression.

In PCR™, the number of molecules of the amplified target DNA increases by a
factor approaching two with every cycle of the reaction until some reagent becomes
limiting. Thereafter, the rate of amplification becomes increasingly diminished until
there is no increase in the amplified target between cycles. If a graph is plotted in which
the cycle number is on the X axis and the log of the concentration of the amplified target
DNA is on the Y axis, a curved line of characteristic shape is formed by connecting the
plotted points. Beginning with the first cycle, the slope of the line is positive and
constant. This is said to be the linear portion of the curve. After a reagent becomes
limiting, the slope of the line begins to decrease and eventually becomes zero. At this
point the concentration of the amplified target DNA becomes asymptotic to some fixed
value. This is said to be the plateau portion of the curve.

The concentration of the target DNA in the linear portion of the PCR™
amplification is directly proportional to the starting concentration of the target before the
reaction began. By determining the concentration of the amplified products of the target
DNA in PCR™ reactions that have completed the same number of cycles and are in their
linear ranges, it is possible to determine the relative concentrations of the specific target
sequence in the original DNA mixture. If the DNA mixtures are cDNAs synthesized
from RNAs isolated from different tissues or cells, the relative abundances of the specific
mRNA from which the target sequence was derived can be determined for the respective
tissues or cells. This direct proportionality between the concentration of the PCR™
products and the relative mRNA abundances is only true in the linear range of the PCR™
reaction.

The final concentration of the target DNA in the plateau portion of the curve is
determined by the availability of reagents in the reaction mix and is independent of the
original concentration of target DNA. Therefore, the first condition that must be met
before the relative abundances of a mRNA species can be determined by RT-PCR™ for a
collection of RNA populations is that the concentrations of the amplified PCR™ products
must be sampled when the PCR™ reactions are in the linear portion of their curves.
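
The behavior described above can be illustrated with a simple numerical model. The following is a sketch only: the logistic form of the per-cycle efficiency, the capacity value, and the starting copy numbers are assumptions made for illustration and are not taken from this specification. In the model, product roughly doubles each cycle while reagents are in excess, and amplification slows to a plateau as the product approaches the amount the limiting reagent can support.

```python
def simulate_pcr(start_copies, cycles, capacity=1e9):
    """Return the copy number after each cycle (index 0 is the starting amount).

    Assumed model: per-cycle efficiency falls linearly from 1 toward 0 as the
    product approaches `capacity`, the amount supported by the limiting reagent.
    """
    copies = [float(start_copies)]
    for _ in range(cycles):
        n = copies[-1]
        efficiency = max(0.0, 1.0 - n / capacity)
        copies.append(n * (1.0 + efficiency))
    return copies


# Two hypothetical reactions whose starting templates differ ten-fold.
low = simulate_pcr(100, 40)
high = simulate_pcr(1000, 40)

# Sampled in the linear portion, the products preserve the ten-fold difference.
early_ratio = high[10] / low[10]
# Sampled at the plateau, both reactions hold nearly the same amount of product.
late_ratio = high[40] / low[40]
```

Under these assumptions the ratio of products at cycle 10 remains close to the ten-fold ratio of starting templates, whereas at cycle 40 the ratio is close to one, which is why sampling must occur in the linear portion of the curve.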

The second condition that must be met for an RT-PCR™ study to successfully
determine the relative abundances of a particular mRNA species is that relative
concentrations of the amplifiable cDNAs must be normalized to some independent
standard. The goal of an RT-PCR™ study is to determine the abundance of a particular
mRNA species relative to the average abundance of all mRNA species in the sample.

Most protocols for competitive PCR™ utilize internal PCR™ standards that are
approximately as abundant as the target. These strategies are effective if the products of
the PCR™ amplifications are sampled during their linear phases. If the products are
sampled when the reactions are approaching the plateau phase, then the less abundant
product becomes relatively over-represented. Comparisons of relative abundances made
for many different RNA samples, such as is the case when examining RNA samples for
differential expression, become distorted in such a way as to make differences in relative
abundances of RNAs appear less than they actually are. This is not a significant problem
if the internal standard is much more abundant than the target. If the internal standard is
more abundant than the target, then direct linear comparisons can be made between RNA
samples.

The above discussion describes theoretical considerations for an RT-PCR™ assay
for plant tissue. The problems inherent in plant tissue samples are that they are of
variable quantity (making normalization problematic), and that they are of variable
quality (necessitating the co-amplification of a reliable internal control, preferably of
larger size than the target). Both of these problems are overcome if the RT-PCR™ is
performed as a relative quantitative RT-PCR™ with an internal standard in which the
internal standard is an amplifiable cDNA fragment that is larger than the target cDNA
fragment and in which the abundance of the mRNA encoding the internal standard is
roughly 5-100 fold higher than the mRNA encoding the target. This assay measures
relative abundance, not absolute abundance of the respective mRNA species.
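
A minimal sketch of how measurements from such an assay might be normalized is given below. The function names and signal values are hypothetical, and the calculation assumes that both the target and the internal standard were sampled in the linear portion of their amplification curves, so that band intensity is proportional to starting template.

```python
def normalized_abundance(target_signal, standard_signal):
    """Express the target product signal relative to the internal standard."""
    if standard_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    return target_signal / standard_signal


def fold_change(sample, reference):
    """Compare two samples, each given as (target_signal, standard_signal)."""
    return normalized_abundance(*sample) / normalized_abundance(*reference)


# Hypothetical band intensities (arbitrary units) from two RNA samples.
reference_sample = (100.0, 1000.0)
test_sample = (300.0, 1500.0)
change = fold_change(test_sample, reference_sample)
```

Dividing by the internal standard corrects for differences in the quantity and quality of the input RNA, so the result is a relative abundance and carries no absolute units.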

Other studies may be performed using a more conventional relative quantitative
RT-PCR™ assay with an external standard protocol. These assays sample the PCR™
products in the linear portion of their amplification curves. The number of PCR™ cycles
that are optimal for sampling must be empirically determined for each target cDNA
fragment. In addition, the reverse transcriptase products of each RNA population isolated
from the various tissue samples must be carefully normalized for equal concentrations of
amplifiable cDNAs. This consideration is very important since the assay measures
absolute mRNA abundance. Absolute mRNA abundance can be used as a measure of
differential gene expression only in normalized samples. While empirical determination
of the linear range of the amplification curve and normalization of cDNA preparations
are tedious and time consuming processes, the resulting RT-PCR™ assays can be
superior to those derived from the relative quantitative RT-PCR™ assay with an internal
standard.

One reason for this advantage is that without the internal standard/competitor, all 
of the reagents can be converted into a single PCR™ product in the linear range of the
amplification curve, thus increasing the sensitivity of the assay. Another reason is that 
with only one PCR™ product, display of the product on an electrophoretic gel or another 
display method becomes less complex, has less background and is easier to interpret. 

2. Marker Gene Expression 

Marker genes represent an efficient means for assaying the expression of
transgenes. Using, for example, a selectable marker gene, one could quantitatively
determine the expression levels in the cell using a construct comprising the selectable
marker coding region operably linked to the promoter to be assayed, e.g., an s-ship
promoter. Alternatively, particular cell types could be exposed to a selective agent and
the relative resistance provided in these cells quantified, thereby providing an estimate of
the tissue specific expression of the promoter.

Screenable markers constitute another efficient means for quantifying the 
expression of a given transgene. Potentially any screenable marker could be expressed 
and the marker gene product quantified, thereby providing an estimate of the efficiency
with which the promoter directs expression of the transgene. Quantification can readily
be carried out using either visual means, or, for example, a photon counting device. 

A preferred screenable marker gene assay for use with the current invention
includes the use of the screenable marker gene β-galactosidase (β-gal), luciferase, or green
fluorescent protein (GFP).

3. Purification and Assays of Proteins

One means for determining the efficiency with which a particular transgene is
expressed is to purify and quantify a polypeptide expressed by the transgene. Protein
purification techniques are well known to those of skill in the art. These techniques
involve, at one level, the crude fractionation of the cellular milieu to polypeptide and
non-polypeptide fractions. Having separated the polypeptide from other proteins, the
polypeptide of interest may be further purified using chromatographic and electrophoretic
techniques to achieve partial or complete purification (or purification to homogeneity).
Analytical methods particularly suited to the preparation of a pure peptide are
ion-exchange chromatography, exclusion chromatography, polyacrylamide gel
electrophoresis, and isoelectric focusing. A particularly efficient method of purifying
peptides is fast protein liquid chromatography or even HPLC.

Various techniques suitable for use in protein purification will be well known to
those of skill in the art. These include, for example, precipitation with ammonium
sulphate, PEG, antibodies and the like or by heat denaturation, followed by
centrifugation; chromatography steps such as ion exchange, gel filtration, reverse phase,
hydroxylapatite and affinity chromatography; isoelectric focusing; gel electrophoresis;
and combinations of such and other techniques. As is generally known in the art, it is
believed that the order of conducting the various purification steps may be changed, or
that certain steps may be omitted, and still result in a suitable method for the preparation
of a substantially purified protein or peptide.

There is no general requirement that the protein or peptide being assayed always
be provided in its most purified state. Indeed, it is contemplated that less substantially
purified products will have utility in certain embodiments. Partial purification may be
accomplished by using fewer purification steps in combination, or by utilizing different
forms of the same general purification scheme. For example, it is appreciated that a
cation-exchange column chromatography performed utilizing an HPLC apparatus will
generally result in a greater "-fold" purification than the same technique utilizing a low
pressure chromatography system. Methods exhibiting a lower degree of relative
purification may have advantages in total recovery of protein product, or in maintaining
the activity of an expressed protein.
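
The trade-off between "-fold" purification and total recovery can be made concrete with a short calculation. The sketch below is illustrative only. It assumes that purity is tracked by specific activity (activity units per milligram of total protein), a common convention that is not stated in this specification, and all names and numbers are hypothetical.

```python
def purification_summary(crude, step):
    """Compare a purification step with the crude extract.

    Each argument is a dict with 'activity_units' (total activity recovered)
    and 'protein_mg' (total protein present in that fraction).
    """
    crude_specific = crude["activity_units"] / crude["protein_mg"]
    step_specific = step["activity_units"] / step["protein_mg"]
    return {
        # Enrichment of the protein of interest relative to the crude extract.
        "fold_purification": step_specific / crude_specific,
        # Share of the starting activity still present after the step.
        "percent_recovery": 100.0 * step["activity_units"] / crude["activity_units"],
    }


# Hypothetical values for a crude lysate and a single chromatography step.
crude_extract = {"activity_units": 10000.0, "protein_mg": 500.0}
after_column = {"activity_units": 6000.0, "protein_mg": 15.0}
summary = purification_summary(crude_extract, after_column)
```

With these hypothetical numbers the step gives a 20-fold purification at 60 percent recovery, showing how a method with a higher degree of purification may nonetheless sacrifice part of the protein product.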

It is known that the migration of a polypeptide can vary, sometimes significantly, 
with different conditions of SDS/PAGE (Capaldi et al., 1977). It will therefore be
appreciated that under differing electrophoresis conditions, the apparent molecular 
weights of purified or partially purified expression products may vary. 

High Performance Liquid Chromatography (HPLC) is characterized by a very
rapid separation with extraordinary resolution of peaks. This is achieved by the use of
very fine particles and high pressure to maintain an adequate flow rate. Separation can be
accomplished in a matter of minutes, or at most an hour. Moreover, only a very small
volume of the sample is needed because the particles are so small and close-packed that
the void volume is a very small fraction of the bed volume. Also, the concentration of
the sample need not be very great because the bands are so narrow that there is very little
dilution of the sample.

Gel chromatography, or molecular sieve chromatography, is a special type of
partition chromatography that is based on molecular size. The theory behind gel
chromatography is that the column, which is prepared with tiny particles of an inert
substance that contain small pores, separates larger molecules from smaller molecules as
they pass through or around the pores, depending on their size. As long as the material of
which the particles are made does not adsorb the molecules, the sole factor determining
rate of flow is the size. Hence, molecules are eluted from the column in decreasing size,
so long as the shape is relatively constant. Gel chromatography is unsurpassed for
separating molecules of different size because separation is independent of all other
factors such as pH, ionic strength, temperature, etc. There also is virtually no adsorption
and less zone spreading, and the elution volume is related in a simple manner to molecular
weight.

Affinity Chromatography is a chromatographic procedure that relies on the
specific affinity between a substance to be isolated and a molecule that it can specifically
bind to. This is a receptor-ligand type interaction. The column material is synthesized by
covalently coupling one of the binding partners to an insoluble matrix. The column
material is then able to specifically adsorb the substance from the solution. Elution
occurs by changing the conditions to those in which binding will not occur (alter pH,
ionic strength, temperature, etc.).

A particular type of affinity chromatography useful in the purification of 
carbohydrate containing compounds is lectin affinity chromatography. Lectins are a class 
of substances that bind to a variety of polysaccharides and glycoproteins. 

The matrix should be a substance that itself does not adsorb molecules to any
significant extent and that has a broad range of chemical, physical and thermal stability.
The ligand should be coupled in such a way as to not affect its binding properties. The
ligand should also provide relatively tight binding. And it should be possible to elute the
substance without destroying the sample or the ligand. One of the most common forms
of affinity chromatography is immunoaffinity chromatography. The generation of
antibodies that would be suitable for use in accord with the present invention is well
known to those of skill in the art.

D. Methods of Gene Transfer 

Suitable methods for nucleic acid delivery to effect expression of compositions of
the present invention are believed to include virtually any method by which a nucleic acid
(e.g., DNA, including viral and nonviral vectors) can be introduced into an organelle, a
cell, a tissue or an organism, as described herein or as would be known to one of ordinary
skill in the art. Such methods include, but are not limited to, direct delivery of DNA such
as by injection (U.S. Patents 5,994,624, 5,981,274, 5,945,100, 5,780,448, 5,736,524,
5,702,932, 5,656,610, 5,589,466 and 5,580,859, each incorporated herein by reference),
including microinjection (Harlan and Weintraub, 1985; U.S. Patent 5,789,215,
incorporated herein by reference); by electroporation (U.S. Patent No. 5,384,253,
incorporated herein by reference); by calcium phosphate precipitation (Graham and Van
Der Eb, 1973; Chen and Okayama, 1987; Rippe et al., 1990); by using DEAE-dextran
followed by polyethylene glycol (Gopal, 1985); by direct sonic loading (Fechheimer et
al., 1987); by liposome mediated transfection (Nicolau and Sene, 1982; Fraley et
al., 1979; Nicolau et al., 1987; Wong et al., 1980; Kaneda et al., 1989; Kato et al., 1991);
by microprojectile bombardment (PCT Application Nos. WO 94/09699 and 95/06128;
U.S. Patents 5,610,042, 5,322,783, 5,563,055, 5,550,318, 5,538,877 and 5,538,880,
each incorporated herein by reference); by agitation with silicon carbide fibers
(Kaeppler et al., 1990; U.S. Patents 5,302,523 and 5,464,765, each incorporated herein
by reference); or by desiccation/inhibition-mediated DNA uptake (Potrykus et al., 1985).
Through the application of techniques such as these, organelle(s), cell(s), tissue(s) or
organism(s) may be stably or transiently transformed.

E. Transgenic and Knockout Animals 
1. Transgenic Animals 

It is further contemplated that transgenic animals are part of the present invention. 
A transgenic animal of the present invention may involve an animal in which an s-ship 
promoter drives the expression of a transgene. The transgene can be expressed 
temporally or spatially in a manner different from, or the same as, that in a non-transgenic 
animal. The transgene may also be heterologous with respect to the host cell or organism, 
such as, for example, the luciferase gene in a mammalian cell. Moreover, it is contemplated 
that the transgene may be expressed in a different tissue type or in a different amount or 
at a different time than the endogenously expressed version of the transgene. 

In a general aspect, a transgenic animal is produced by the integration of a given 
transgene into the genome in a manner that permits the expression of the transgene, or by 
disrupting the wild-type gene, leading to a knockout of the wild-type gene. Methods for 
producing transgenic animals are generally described by Wagner and Hoppe (U.S. Patent 
No. 4,873,191; which is incorporated herein by reference), Brinster et al. (1985; which is 
incorporated herein by reference in its entirety) and in "Manipulating the Mouse Embryo; 
A Laboratory Manual" 2nd edition (eds., Hogan, Beddington, Costantini and Lacy, Cold 
Spring Harbor Laboratory Press, 1994; which is incorporated herein by reference in its 
entirety). 

U.S. Patent 5,639,457 is also incorporated herein by reference to supplement the 
present teaching regarding transgenic pig and rabbit production. U.S. Patents 5,175,384, 
5,175,385, 5,530,179, 5,625,125, 5,612,486 and 5,565,186 are also each incorporated 
herein by reference to similarly supplement the present teaching regarding transgenic 
mouse and rat production. Transgenic animals may be crossed with other transgenic 
animals or knockout animals to evaluate phenotype based on compound alterations in the 
genome. 

2. Knockout Animals or Cells 

The generation of an animal model lacking s-ship or a particular nucleic acid 
(encoding an RNA that is translated or not) is contemplated as part of the present 
invention to further the understanding of stem cell function. This strategy could also be 
implemented in cell culture. 

The lack of activity as a result of the knockout may provoke various types of 
pathophysiological disturbances in a knockout animal or cell. This can be used to 
characterize the role or function of a particular gene product at a particular time in 
development or in a particular cell type. The s-ship promoter can be used to drive 
the expression of the knockout gene such that only certain cells, for example stem cells, 
may be affected. One method of inhibiting the endogenous expression of a particular 
gene in an animal is to disrupt the gene in germline cells and produce offspring from 
these cells. This method is generally known as knockout technology. U.S. Patent No. 
5,616,491, incorporated herein by reference in its entirety, generally describes the 
techniques involved in the preparation of knockout mice, and in particular describes mice 
having a suppressed level of expression of the gene encoding CD28 on T cells, and mice 
wherein the expression of the gene encoding CD45 is suppressed on B cells. Pfeffer et 
al. (1993) describe mice in which the gene encoding the tumor necrosis factor receptor 
p55 has been suppressed. The mice showed a decreased response to tumor necrosis 
factor signaling. Fung-Leung et al. (1991a; 1991b) describe knockout mice lacking 
expression of the gene encoding CD8. These mice were found to have a decreased level 
of cytotoxic T cell response to various antigens and to certain viral pathogens such as 
lymphocytic choriomeningitis virus. 

The term "knockout" refers to a partial or complete suppression of the expression 
of at least a portion of a protein encoded by an endogenous DNA sequence in a cell. The 
term "knockout construct" refers to a nucleic acid sequence that is designed to decrease 
or suppress expression of a protein encoded by endogenous DNA sequences in a cell. 
The nucleic acid sequence used as the knockout construct typically comprises: (1) 
DNA from some portion of the gene (exon sequence, intron sequence, and/or promoter 
sequence) to be suppressed, in conjunction with all or part of the s-ship promoter; and (2) 
a marker sequence used to detect the presence of the knockout construct in the cell. The 
knockout construct is inserted into a cell, and integrates with the genomic DNA of the 
cell in such a position so as to prevent or interrupt transcription of the native DNA 
sequence. Such insertion usually occurs by homologous recombination (i.e., regions of 
the knockout construct that are homologous to endogenous DNA sequences hybridize to 
each other when the knockout construct is inserted into the cell and recombine so that the 
knockout construct is incorporated into the corresponding position of the endogenous 
DNA). 

The knockout construct nucleic acid sequence may comprise 1) a full or partial 
sequence of one or more exons and/or introns of the gene to be suppressed, 2) a full or 
partial promoter sequence of the gene to be suppressed, or 3) combinations thereof. 
Typically, the knockout construct is inserted into an embryonic stem cell (ES cell) and is 
integrated into the ES cell genomic DNA, usually by the process of homologous 
recombination. This ES cell is then injected into, and integrates with, the developing 
embryo. 

The phenotype of a mouse heterozygous for the knockout may lend clues as to the 
function and importance of that gene or sequence, as well as contribute to an understanding 
of its physiological relevance, particularly with respect to disease states. Animals 
completely lacking the targeted gene (homozygous null) may provide additional 
information. Mice lacking the targeted gene may not be viable, which itself is indicative 
of the importance of that gene. Should such mice be viable (heterozygous or 
homozygous nulls), they may be crossed with other transgenic or knockout mice. 
Furthermore, knockout mice having any phenotype that resembles a disease state may be 
used to screen or test therapeutic drugs that slow, modify, or cure conditions. As is 
known to the skilled artisan, a conditional knockout, wherein the gene is disrupted under 
certain conditions, is frequently used. 

3. Conditional Transgenic and Knockdown Animals and Cells 
The present invention further contemplates conditional transgenic or knockdown 
animals (or cells in culture), such as those produced using recombination methods. 
Bacteriophage P1 Cre recombinase and flp recombinase from yeast plasmids are two 
non-limiting examples of site-specific DNA recombinase enzymes which cleave DNA at 
specific target sites (loxP sites for cre recombinase and frt sites for flp recombinase) and 
catalyze a ligation of this DNA to a second cleaved site. A large number of suitable 
alternative site-specific recombinases have been described, and their genes can be used in 
accordance with the method of the present invention. Such recombinases include the Int 
recombinase of bacteriophage λ (with or without Xis) (Weisberg et al, 1983, herein 
incorporated by reference); TpnI and the β-lactamase transposons (Mercier et al, 1990); 
the Tn3 resolvase (Flanagan and Fennewald, 1989; Stark et al, 1989); the yeast 
recombinases (Matsuzaki et al, 1990); the B. subtilis SpoIVC recombinase (Sato et al, 
1990); the Flp recombinase (Schwartz and Sadowski, 1989; Parsons et al, 1990; Golic 
and Lindquist, 1989; Amin et al, 1990); the Hin recombinase (Glasgow et al, 1989); 
immunoglobulin recombinases (Malynn et al, 1988); and the Cin recombinase (Haffter 
and Bickle, 1988; Hubner et al, 1989), all herein incorporated by reference. Such 
systems are discussed elsewhere (Echols, 1990; de Villartay, 1988; Craig, 1988; 
Poyart-Salmeron et al, 1989; Hunger-Bertling et al, 1990; and Cregg and Madden, 1989), 
all herein incorporated by reference. 

Of particular interest in the present invention is the Cre recombinase. Cre has 
been purified to homogeneity, and its reaction with the loxP site has been extensively 
characterized (Abremski and Hess, 1984, herein incorporated by reference). Cre protein 
has a molecular weight of 35,000 and can be obtained commercially from New England 
Nuclear/DuPont. The cre gene (which encodes the Cre protein) has been cloned and 
expressed (Abremski et al, 1983, herein incorporated by reference). The Cre protein 
mediates recombination between two loxP sequences (Sternberg et al, 1981), which may 
be present on the same or different DNA molecules. Because the internal spacer sequence 
of the loxP site is asymmetrical, two loxP sites can exhibit directionality relative to one 
another (Hoess and Abremski, 1984). Thus, when two sites on the same DNA molecule 
are in a directly repeated orientation, Cre will excise the DNA between the sites 
(Abremski et al, 1983). However, if the sites are inverted with respect to each other, the 
DNA between them is not excised after recombination but is simply inverted. Thus, a 
circular DNA molecule having two loxP sites in direct orientation will recombine to 
produce two smaller circles, whereas circular molecules having two loxP sites in an 
inverted orientation simply invert the DNA sequences flanked by the loxP sites. In 
addition, recombinase action can result in reciprocal exchange of regions distal to the 
target site when targets are present on separate DNA molecules. 
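
The orientation rule described above can be illustrated with a small sketch. The following Python model is purely illustrative and is not part of the disclosed methods: the list-of-segments representation, the `LOX_FWD` and `LOX_REV` tokens, the segment names (`promoter`, `neo`, `gene`), and the functions `cre_recombine` and `flip` are hypothetical names introduced here. The model ignores the molecular mechanism and captures only the outcome for two loxP sites on one DNA molecule.

```python
# Toy model of Cre-mediated recombination between two loxP sites on one DNA molecule.
# Assumption: DNA is a list of named segments; loxP sites are tokens with an orientation.
LOX_FWD = "loxP>"   # loxP site in forward orientation (hypothetical token)
LOX_REV = "<loxP"   # loxP site in reverse orientation (hypothetical token)


def flip(segment):
    """Toggle a trailing "(rev)" tag to show that a segment's orientation changed."""
    return segment[:-5] if segment.endswith("(rev)") else segment + "(rev)"


def cre_recombine(dna):
    """Return (product, excised) after recombination between the two loxP sites.

    Directly repeated sites: the flanked DNA is excised together with one site,
    and a single site remains in the product.
    Inverted sites: the flanked DNA is inverted in place and nothing is excised.
    """
    sites = [i for i, seg in enumerate(dna) if seg in (LOX_FWD, LOX_REV)]
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    first, second = sites
    flanked = dna[first + 1:second]

    if dna[first] == dna[second]:
        # Direct repeat: remove the intervening DNA, leaving one loxP site behind.
        product = dna[:first + 1] + dna[second + 1:]
        excised = flanked + [dna[second]]
        return product, excised

    # Inverted repeat: reverse the order of the flanked segments and tag each one.
    inverted = [flip(seg) for seg in reversed(flanked)]
    product = dna[:first + 1] + inverted + dna[second:]
    return product, []


if __name__ == "__main__":
    print(cre_recombine(["promoter", LOX_FWD, "neo", LOX_FWD, "gene"]))
    print(cre_recombine(["promoter", LOX_FWD, "neo", LOX_REV, "gene"]))
```

In the first call the marker segment between directly repeated sites is removed from the product; in the second call, with inverted sites, the same segment is retained but reversed.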

Recombinases have important application for characterizing gene function in 
knockout models. When the constructs described herein are used to disrupt limulus 
clotting factor protease-like genes, a fusion transcript can be produced when insertion of 
the positive selection marker occurs downstream (3') of the translation initiation site of 
the limulus clotting factor protease-like gene. The fusion transcript could result in some 
level of protein expression with unknown consequence. It has been suggested that 
insertion of a positive selection marker gene can affect the expression of nearby genes. 
These effects may make it difficult to determine gene function after a knockout event 
since one could not discern whether a given phenotype is associated with the inactivation 
of a gene, or the transcription of nearby genes. Both potential problems are solved by 
exploiting recombinase activity. When the positive selection marker is flanked by 
recombinase sites in the same orientation, the addition of the corresponding recombinase 
will result in the removal of the positive selection marker. In this way, effects caused by 
the positive selection marker or expression of fusion transcripts are avoided. 

III. Proteinaceous Compositions 

In certain embodiments, the present invention concerns novel compositions 
comprising at least one proteinaceous molecule, such as s-SHIP1, SHIP1, or a modulator 
of an s-ship1 promoter. As used herein, a "proteinaceous molecule," "proteinaceous 
composition," "proteinaceous compound," "proteinaceous chain" or "proteinaceous 
material" generally refers, but is not limited to, a protein of greater than about 200 amino 
acids or the full length endogenous sequence translated from a gene; a polypeptide of 
greater than about 100 amino acids; and/or a peptide of from about 3 to about 100 amino 
acids. All the "proteinaceous" terms described above may be used interchangeably 
herein. 
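
As an informal illustration of how the size ranges in this definition overlap, the following Python sketch maps a chain length to the size-based terms that fit it. The function name `proteinaceous_terms` is hypothetical, and the "about" thresholds (3, 100 and 200 residues) are treated as exact cut-offs only for the purpose of the example; the definition above uses them approximately and is expressly non-limiting.

```python
def proteinaceous_terms(n_residues):
    """Return the size-based terms from the definition above that fit a chain length.

    Assumption: the approximate thresholds in the text are treated as exact cut-offs.
    """
    terms = []
    if 3 <= n_residues <= 100:
        terms.append("peptide")        # from about 3 to about 100 amino acids
    if n_residues > 100:
        terms.append("polypeptide")    # greater than about 100 amino acids
    if n_residues > 200:
        terms.append("protein")        # greater than about 200 amino acids
    return terms


if __name__ == "__main__":
    for length in (10, 150, 250):
        print(length, proteinaceous_terms(length))
```

A chain longer than about 200 residues satisfies both the "polypeptide" and "protein" ranges, which is consistent with the statement that the terms may be used interchangeably.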

In certain embodiments the size of the at least one proteinaceous molecule may 
comprise, but is not limited to, about 5, about 6, about 7, about 8, about 9, about 10, 
about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, 
about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, 
about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, 
about 38, about 39, about 40, about 41, about 42, about 43, about 44, about 45, about 46, 
about 47, about 48, about 49, about 50, about 51, about 52, about 53, about 54, about 55, 
about 56, about 57, about 58, about 59, about 60, about 61, about 62, about 63, about 64, 
about 65, about 66, about 67, about 68, about 69, about 70, about 71, about 72, about 73, 
about 74, about 75, about 76, about 77, about 78, about 79, about 80, about 81, about 82, 
about 83, about 84, about 85, about 86, about 87, about 88, about 89, about 90, about 91, 
about 92, about 93, about 94, about 95, about 96, about 97, about 98, about 99, about 100, 
about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, 
about 190, about 200, about 210, about 220, about 230, about 240, about 250, about 275, 
about 300, about 325, about 350, about 375, about 400, about 425, about 450, about 475, 
about 500, about 525, about 550, about 575, about 600, about 625, about 650, about 675, 
about 700, about 725, about 750, about 775, about 800, about 825, about 850, about 875, 
about 900, about 925, about 950, about 975, about 1000, about 1100, about 1200, about 
1300, about 1400, about 1500, about 1750, about 2000, about 2250, about 2500 or greater 
amino molecule residues, and any range derivable therein. 

As used herein, an "amino molecule" refers to any amino acid, amino acid 
derivative or amino acid mimic as would be known to one of ordinary skill in the art. In 
certain embodiments, the residues of the proteinaceous molecule are sequential, without 
any non-amino molecule interrupting the sequence of amino molecule residues. In other 
embodiments, the sequence may comprise one or more non-amino molecule moieties. In 
particular embodiments, the sequence of residues of the proteinaceous molecule may be 
interrupted by one or more non-amino molecule moieties. 

Accordingly, the term "proteinaceous composition" encompasses amino molecule 
sequences comprising at least one of the 20 common amino acids in naturally synthesized 
proteins, or at least one modified or unusual amino acid. 

Proteinaceous compositions may be made by any technique known to those of 
skill in the art, including the expression of proteins, polypeptides or peptides through 
standard molecular biological techniques, the isolation of proteinaceous compounds from 
natural sources, or the chemical synthesis of proteinaceous materials. The nucleotide 
and protein, polypeptide and peptide sequences for various genes have been previously 
disclosed, and may be found at computerized databases known to those of ordinary skill 
in the art. One such database is the National Center for Biotechnology Information's 
GenBank and GenPept databases (http://www.ncbi.nlm.nih.gov/). The coding regions for 
these known genes may be amplified and/or expressed using the techniques disclosed 
herein or as would be known to those of ordinary skill in the art. Alternatively, various 
commercial preparations of proteins, polypeptides and peptides are known to those of 
skill in the art. 

In certain embodiments a proteinaceous compound may be purified. Generally, 
"purified" will refer to a specific protein, polypeptide, or peptide composition that has 
been subjected to fractionation to remove various other proteins, polypeptides, or 
peptides, and which composition substantially retains its activity, as may be assessed, for 
example, by protein assays, as would be known to one of ordinary skill in the art for 
the specific or desired protein, polypeptide or peptide. 

It is contemplated that virtually any protein-, polypeptide- or peptide-containing 
component may be used in the compositions and methods disclosed herein. However, it 
is preferred that the proteinaceous material is biocompatible. 

IV. Therapeutic Applications 

The invention is widely applicable to a variety of situations where it is desirable 
to be able to regulate the level of gene expression, such as by turning gene expression 
"on" and "off," in a rapid, efficient and controlled manner without causing pleiotropic 
effects or cytotoxicity. The invention may be particularly useful for gene therapy 
purposes in humans, in treatments for either genetic or acquired diseases. The general 
approach of gene therapy involves the introduction of one or more nucleic acid molecules 
into cells such that one or more gene products encoded by the introduced genetic material 
are produced in the cells to restore or enhance a functional activity. For reviews on gene 
therapy approaches, see Anderson et al. (1992); Miller et al. (1992); Friedmann et al. (1989); 
and Cournoyer et al. (1990). However, current gene therapy vectors typically utilize 
constitutive regulatory elements which are responsive to endogenous transcription 
factors. These vector systems do not allow for the ability to modulate the level of gene 
expression in a subject. In contrast, the regulatory system of the invention provides this 
ability. 

To use the system of the invention for gene therapy purposes, at least one DNA 
molecule is introduced into cells of a subject in need of gene therapy (e.g., a human 
subject suffering from a genetic or acquired disease) to modify the cells. The cells are 
modified to comprise: 1) nucleic acid encoding an inducible regulator of the invention in 
a form suitable for expression of the inducible regulator in the host cells; and 2) an 
siRNA (e.g., for therapeutic purposes) operatively linked to a tissue-specific promoter 
such as an s-ship1 promoter. A single DNA molecule encoding components of the 
regulatory system of the invention can be used, or alternatively, separate DNA molecules 
encoding each component can be used. The cells of the subject can be modified ex vivo 
and then introduced into the subject or the cells can be directly modified in vivo by 
conventional techniques for introducing nucleic acid into cells. Thus, the regulatory 
system of the invention offers the advantage over constitutive regulatory systems of 
allowing for modulation of the level of gene expression depending upon the requirements 
of the therapeutic situation. 

Genes of particular interest to be knocked down or knocked out in cells of a 
subject for treatment of genetic or acquired diseases include those encoding a deleterious 
gene product, such as an abnormal protein. Non-limiting examples of specific diseases 
include anemia, blood-related cancers, Parkinson's disease, and diabetes. 

The present invention can be applied to develop autologous or allogeneic cell 
lines for therapeutic purposes. For example, gene therapy applications of particular 
interest in cell and/or organ transplantation are utilized with the present invention. In 
exemplary embodiments, downregulation of transplantation antigens (such as, for 
example, by downregulation of beta2-microglobulin expression via siRNA) allows for 
transplantation of allogeneic cells while minimizing the risk of rejection by the patient's 
immune system. The present invention would allow the RNAi to be switched off in case 
of adverse effects (e.g., uncontrollable replication of the transplanted cells). 

Cell types that can be subjected to the present invention include hematopoietic 
stem cells, myoblasts, hepatocytes, lymphocytes, airway epithelium, skin epithelium, 
islets, dopaminergic neurons, keratinocytes, and so forth. For further descriptions of cell 
types, genes and methods for gene therapy see, e.g., Wilson et al. (1988); Armentano et 
al. (1990); Wolff et al. (1990); Chowdhury et al. (1991); Ferry et al. (1991); Wilson et al. 
(1992); Quantin et al. (1992); Dai et al. (1992); van Beusechem et al. (1992); Rosenfeld 
et al. (1992); Kay et al. (1992); Cristiano et al. (1993); Hwu et al. (1993); and Herz and 
Gerard (1993). 

In particular embodiments of the present invention, there is a method of treating 
any disease condition amenable to treatment with an s-ship promoter. In specific 
embodiments, the method comprises preparing a polynucleotide construct having a 
region encoding a therapeutic or diagnostic (marker) gene that is operably linked to an 
s-ship promoter, wherein the gene encoded by the construct is for the treatment of the 
disease condition. 

A. Pharmaceutical Formulations, Delivery, and Treatment Regimens 

In an embodiment of the present invention, methods of treatment are 
contemplated. An effective amount of the pharmaceutical composition, generally, is 
defined as that amount sufficient to detectably and repeatedly ameliorate, reduce, 
minimize or limit the extent of the disease or its symptoms. More rigorous definitions 
may apply, including elimination, eradication or cure of disease. 

The routes of administration will vary, naturally, with the location and nature of 
the lesion, and include, e.g., intradermal, transdermal, parenteral, intravenous, 
intramuscular, intranasal, subcutaneous, percutaneous, intratracheal, intraperitoneal, 
intratumoral, perfusion, lavage, direct injection, and oral administration and formulation. 

Solutions of the active compounds as free base or pharmacologically acceptable 
salts may be prepared in water suitably mixed with a surfactant, such as 
hydroxypropylcellulose. Dispersions may also be prepared in glycerol, liquid 
polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of 
storage and use, these preparations contain a preservative to prevent the growth of 
microorganisms. The pharmaceutical forms suitable for injectable use include sterile 
aqueous solutions or dispersions and sterile powders for the extemporaneous preparation 
of sterile injectable solutions or dispersions (U.S. Patent 5,466,468, specifically 
incorporated herein by reference in its entirety). In all cases the form must be sterile and 
must be fluid to the extent that easy syringability exists. It must be stable under the 
conditions of manufacture and storage and must be preserved against the contaminating 
action of microorganisms, such as bacteria and fungi. The carrier can be a solvent or 
dispersion medium containing, for example, water, ethanol, polyol (e.g., glycerol, 
propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, 
and/or vegetable oils. Proper fluidity may be maintained, for example, by the use of a 
coating, such as lecithin, by the maintenance of the required particle size in the case of 
dispersion and by the use of surfactants. The prevention of the action of microorganisms 
can be brought about by various antibacterial and antifungal agents, for example, 
parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it 
will be preferable to include isotonic agents, for example, sugars or sodium chloride. 
Prolonged absorption of the injectable compositions can be brought about by the use in 
the compositions of agents delaying absorption, for example, aluminum monostearate and 
gelatin. 

For parenteral administration in an aqueous solution, for example, the solution 
should be suitably buffered if necessary and the liquid diluent first rendered isotonic with 
sufficient saline or glucose. These particular aqueous solutions are especially suitable for 
intravenous, intramuscular, subcutaneous, intratumoral and intraperitoneal 
administration. In this connection, sterile aqueous media that can be employed will be 
known to those of skill in the art in light of the present disclosure. For example, one 
dosage may be dissolved in 1 ml of isotonic NaCl solution and either added to 1000 ml of 
hypodermoclysis fluid or injected at the proposed site of infusion (see, for example, 
"Remington's Pharmaceutical Sciences" 15th Edition, pages 1035-1038 and 1570-1580). 
Some variation in dosage will necessarily occur depending on the condition of the subject 
being treated. The person responsible for administration will, in any event, determine the 
appropriate dose for the individual subject. Moreover, for human administration, 
preparations should meet sterility, pyrogenicity, general safety and purity standards as 
required by FDA Office of Biologics standards. 

Sterile injectable solutions are prepared by incorporating the active compounds in 
the required amount in the appropriate solvent with various of the other ingredients 
enumerated above, as required, followed by filter sterilization. Generally, dispersions 
are prepared by incorporating the various sterilized active ingredients into a sterile 
vehicle which contains the basic dispersion medium and the required other ingredients 
from those enumerated above. In the case of sterile powders for the preparation of sterile 
injectable solutions, the preferred methods of preparation are vacuum-drying and 
freeze-drying techniques which yield a powder of the active ingredient plus any additional 
desired ingredient from a previously sterile-filtered solution thereof. 

The compositions disclosed herein may be formulated in a neutral or salt form. 
Pharmaceutically-acceptable salts include the acid addition salts (formed with the free 
amino groups of the protein), which are formed with inorganic acids such as, for 
example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, 
tartaric, mandelic, and the like. Salts formed with the free carboxyl groups can also be 
derived from inorganic bases such as, for example, sodium, potassium, ammonium, 
calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 
histidine, procaine and the like. Upon formulation, solutions will be administered in a 
manner compatible with the dosage formulation and in such amount as is therapeutically 
effective. The formulations are easily administered in a variety of dosage forms such as 
injectable solutions, drug release capsules and the like. 

As used herein, "carrier" includes any and all solvents, dispersion media, 
vehicles, coatings, diluents, antibacterial and antifungal agents, isotonic and absorption 
delaying agents, buffers, carrier solutions, suspensions, colloids, and the like. The use of 
such media and agents for pharmaceutically active substances is well known in the art. 
Except insofar as any conventional media or agent is incompatible with the active 
ingredient, its use in the therapeutic compositions is contemplated. Supplementary active 
ingredients can also be incorporated into the compositions. 

The phrase "pharmaceutically-acceptable" or "pharmacologically-acceptable" 
refers to molecular entities and compositions that do not produce an allergic or similar 
untoward reaction when administered to a human. The preparation of an aqueous 
composition that contains a protein as an active ingredient is well understood in the art. 
Typically, such compositions are prepared as injectables, either as liquid solutions or 
suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection 
can also be prepared. 

B. Combination Treatments 

The compounds and methods of the present invention may be used in the context 
of traditional therapies. In order to increase the effectiveness of a treatment with the 
compositions of the present invention, it may be desirable to combine these compositions 
with other agents effective in the treatment of those diseases and conditions. For 
example, the treatment of a cancer may be implemented with therapeutic compounds of 
the present invention and other anti-cancer therapies, such as anti-cancer agents or 
surgery. Likewise, the treatment of a vascular disease or condition may involve both the 
present invention and conventional vascular agents or therapies. 

Various combinations may be employed; for example, a host cell of the present 
invention is "A" and the secondary anti-cancer agent/therapy is "B": 

A/B/A    B/A/B    B/B/A    A/A/B    A/B/B    B/A/A    A/B/B/B    B/A/B/B 
B/B/B/A  B/B/A/B  A/A/B/B  A/B/A/B  A/B/B/A  B/B/A/A 
B/A/B/A  B/A/A/B  A/A/A/B  B/A/A/A  A/B/A/A  A/A/B/A 
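
The combinations listed above correspond to every three-step and four-step ordering that uses both "A" and "B" at least once. As a hedged illustration only, the following Python sketch enumerates such orderings; the function name `mixed_schedules` is hypothetical and is not part of the disclosed methods.

```python
from itertools import product


def mixed_schedules(length, agents=("A", "B")):
    """List every ordering of the given length that uses each agent at least once."""
    return [
        "/".join(seq)
        for seq in product(agents, repeat=length)
        if set(seq) == set(agents)
    ]


if __name__ == "__main__":
    for n in (3, 4):
        schedules = mixed_schedules(n)
        print(n, len(schedules), schedules)
```

For a given length n there are 2^n - 2 such orderings, because only the all-"A" and all-"B" sequences are excluded.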

Administration of the therapeutic expression constructs of the present invention to 
a patient will follow general protocols for the administration of that particular secondary 
therapy, taking into account the toxicity, if any, of the treatment. It is expected that the 
treatment cycles would be repeated as necessary. It also is contemplated that various 
standard therapies, as well as surgical intervention, may be applied in combination with 
the described therapy. 

V. EXAMPLES 

The following examples are included to demonstrate preferred embodiments of 
the invention. It should be appreciated by those of skill in the art that the techniques 
disclosed in the examples which follow represent techniques discovered by the inventor 
to function well in the practice of the invention, and thus can be considered to constitute 
preferred modes for its practice. However, those of skill in the art should, in light of the 
present disclosure, appreciate that many changes can be made in the specific 
embodiments which are disclosed and still obtain a like or similar result without 
departing from the spirit and scope of the invention. 

EXAMPLE 1: 
Materials and Methods 

Cell growth and transfection conditions 

NIH3T3 cells, originally obtained from the American Type Culture Collection 
(ATCC, Rockville, Maryland), were grown in DMEM with 10% fetal bovine serum. The 
D3 embryonic stem (ES) cell line was obtained from Dr. Tasuku Honjo (Nakano et al, 
1994) and grown in high glucose DMEM (GIBCO/Invitrogen Corp., #11965-092) 
supplemented with 2 mM L-glutamine, 1 mM sodium pyruvate, 0.1 mM nonessential 
amino acids, 0.15 mM monothioglycerol (Sigma, M7522), and 15% fetal bovine serum 
(pre-tested for ES cell growth; HyClone Labs, Inc.). D3 ES cells were routinely grown 
on a LIF-producing feeder layer of mitomycin C-treated (Nagy et al, 2003) SNL cells, 
obtained from Phil Soriano (FHCRC). The SNL cells are G418-resistant. Usually, one 
passage before flow cytometry, ES cells were transferred to gelatin (Sigma)-coated plates 
without a feeder layer and with LIF (ESGRO) added to the medium (1000 units/ml). 

DNA was transfected into D3 ES cells by electroporation essentially as described 
by Nagy et al (2003). ES cells were suspended in PBS (Ca2+- and Mg2+-free) at 1 x 10^6 
cells/ml, and 0.8 ml of the cell suspension was placed in a 0.4-cm-wide electrode-gap 
sterile cuvette (BIO-RAD). Plasmid DNA (20 µg), linearized by overnight digestion with 
AflII and Qiagen-purified, was added and mixed. Two pulses (instead of one as recommended) 
of current were applied to the cells in the cuvette employing settings of 500 µF and 230 V 
on a BIO-RAD Gene-Pulser™ with Capacitance Extender. After 5 min on ice, the 
viscous solution was transferred to a 10-cm culture dish containing mitomycin C-treated 
SNL cells. After 24 hr, G418 selection was begun using 280 µg/ml active G418. Cells 
were passaged after 10-14 days onto gelatin-coated plates (no feeder cells) in 
LIF-containing medium with G418. Flow cytometry was performed 3-4 days later. 
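
The nominal quantities implied by the electroporation conditions described above can be worked out with a short calculation. The sketch below assumes that the capacitance setting is read as 500 microfarads, that the full 230 V is applied across the 0.4-cm gap, and that the capacitor discharges completely; the variable names are illustrative only.

```python
# Nominal electroporation quantities derived from the stated conditions.
cell_density_per_ml = 1e6      # cells/ml suspended in PBS
suspension_volume_ml = 0.8     # volume placed in the cuvette
plasmid_dna_ug = 20.0          # linearized plasmid DNA added
gap_cm = 0.4                   # electrode gap of the cuvette
voltage_v = 230.0              # voltage setting
capacitance_f = 500e-6         # ASSUMPTION: setting read as 500 microfarads

cells_in_cuvette = cell_density_per_ml * suspension_volume_ml
field_strength_v_per_cm = voltage_v / gap_cm
stored_energy_j = 0.5 * capacitance_f * voltage_v ** 2      # E = C * V^2 / 2, per pulse
dna_pg_per_cell = plasmid_dna_ug * 1e6 / cells_in_cuvette   # 1 ug = 1e6 pg

print(f"cells in cuvette:        {cells_in_cuvette:.0f}")
print(f"nominal field strength:  {field_strength_v_per_cm:.0f} V/cm")
print(f"stored energy per pulse: {stored_energy_j:.3f} J")
print(f"DNA per cell:            {dna_pg_per_cell:.0f} pg")
```

These are upper-bound, nominal figures; the energy actually delivered to the cells depends on the resistance of the sample and on the instrument.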

AflII-linearized plasmid DNA (10 µg) was introduced into NIH3T3 cells by 
transfection using Superfect reagent (Qiagen) as recommended by the manufacturer. 
G418 selection was begun 24 hr after transfection using 400 µg/ml G418. Cells were 
passaged twice in G418 before flow cytometry. Whether DNA was introduced by 
electroporation into ES cells or by transfection into NIH3T3 cells, abundant 
G418-resistant colonies were obtained for each cell type. 

Two positive control GFP-expression plasmids were used for both NIH3T3 cells 
and the D3 ES cells to confirm that the transfection/electroporation steps were functional 
and that GFP expression occurred in each experiment. These positive controls also helped set 
the gates for analyses of GFP-expressing cells. These two plasmids were the pIRES2-GFP 
empty plasmid (BD Biosciences Clontech) and pIRES2-GFP containing an insert 
encoding the Capn5 gene. Both plasmids expressed equally well in each cell type, and the 
empty pIRES2-GFP vector always expressed higher levels of GFP than the one 
containing the insert. 

Immunoblotting analysis for SHIP proteins 

The techniques for cell extraction, electrophoresis, and immunoblotting have been 
described previously (Liu et al., 21 10). Equal amounts of protein extracts from each cell 
type were loaded for gel electrophoresis. SHIP proteins were detected using monoclonal 
antibody P2C6 at a 1:1000 dilution (Lucas and Rohrschneider, 1999). 

Flow cytometry 

Cells were examined for GFP expression on a Caliber II bench-top analyzer. 
Cytometer settings were established using positive FDC-P1 cells expressing GFP from a 
retroviral vector and negative cells that were either not transfected or transfected with 
an empty plasmid. At least 10^4 cells were analyzed for each plasmid transfected, and two 
independent transfections were examined. Both transfections gave similar results, and the 
results of one experiment are shown. 
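
The way a gate established on negative control cells translates into a percentage of GFP-positive cells can be sketched as follows. This is an illustration only: the fluorescence values are synthetic, and the function name, the 99th-percentile gate, and the event counts are assumptions rather than the settings actually used.

```python
def percent_positive(sample, negative_control, gate_percentile=99):
    """Percent of sample events brighter than a gate set on the negative control."""
    ranked = sorted(negative_control)
    # Take as the gate the intensity of the event at the chosen percentile
    # of the ranked negative-control events.
    index = min(len(ranked) - 1, len(ranked) * gate_percentile // 100)
    gate = ranked[index]
    positives = sum(1 for value in sample if value > gate)
    return 100.0 * positives / len(sample)


# Synthetic data: 10^4 negative events with intensities 0..99, and a sample in
# which half of the events are bright (intensity 1000).
negative = [float(i % 100) for i in range(10_000)]
sample = negative[:5_000] + [1_000.0] * 5_000

print(percent_positive(sample, negative))    # bright half of the sample is scored positive
print(percent_positive(negative, negative))  # the control itself scores as negative
```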

Construction of promoter-less GFP-expression constructs for analysis of s-SHIP 
intron-5 promoter activity 

A 7.6-kb DNA SacI-SacI fragment from a Lambda 129Sv mouse genomic clone 
(Wolf et al, 2000, NCBI accession #AF235499, hereby incorporated by reference) was 
used for initial examination of potential tissue-specific promoter activity. This region 
contained almost all of intron-5, the 88 bp of exon-6, and 1271 bp extending into 
intron-6. This 7.6-kb segment was cloned into pBluescript KS (Stratagene), and sub-segments 
of the region were obtained with the restriction sites shown in FIG. 2. These 
sub-segments were cloned into a promoter-less GFP-expression construct. 

The promoter-less GFP-expression construct was made from the pEGFP-1 
plasmid (BD Biosciences Clontech) by modifications of the MCS (multiple cloning site), 
incorporating additional synthesized cloning sites (EcoRI-AccI(up)-BssHII-NheI-PstI) 
for insertion of the sub-fragments from the 7.6 kb intron-5 clone. Both AccI and BssHII 
recognize multiple sequences, and the nucleotide sequences in the synthesized DNA 
corresponded to the AccI site at nucleotide 2776 of the 7.6-kb region and the 5' BssHII site 
of the pBluescript plasmid, respectively. In addition, prior to incorporation of the 
extended MCS, the SV40 early and late introns from pCMVp were inserted at the 3' end 
of the MCS between the KpnI and AgeI sites. Two intron cassettes were used: one 
containing only the splice acceptor site from the long intron, and a second containing 
both early and late introns. The former was used only for inserts (e.g., the 7.6-kb and 
4.2-kb inserts) containing an intact exon 6 with its splice donor site. The two final 
plasmids, each containing the extended MCS and either the late SV40 intron only 
(pEGFP2-SD3-1) or both SV40 introns (pEGFP2-SD1-2), were sequenced through the 
inserted intron region, and one of each with the correct sequence was selected for 
inserting the 7.6-kb clone and sub-regions. 

The longest promoter construct contained the entire 7.6-kb putative s-SHIP 
promoter region, which was excised from the pBluescript plasmid with BssHII for insertion 
into the MCS of the pEGFP-SD3-1 plasmid. The 6.3-kb fragment was obtained with a 
partial PstI digestion and complete BssHII digestion. The 4.4-kb and 4.2-kb fragments 
were derived from PstI and AccI digestions, respectively. The 1.9-kb segment was 
obtained from digestion of the 4.4 kb fragment with NheI. The smallest 0.96 kb region 
was produced by deleting a region of the pBluescript 7.6 kb clone from the SwaI site 960 
nucleotides 5' of exon 6, to the FbaI site 22 nucleotides from the 5' end of the 7.6 kb 
clone. After ligation, the fragment from the 5' BssHII site to the PstI site was excised. 
Each fragment was inserted into its respective restriction sites of the extended MCS. 
Restriction analysis of each purified plasmid confirmed the correct insert in the correct 
orientation, and all cloning junctions were sequenced to confirm proper ligation. Each 
plasmid was linearized with AflII and Qiagen-purified from agarose gels before 
electroporation or transfection. 

Construction of the 11.5kb- and 6.2kb-GFP s-SHIP promoter transgenes 

The 11.5kb-GFP transgenic construct was prepared from two separate plasmids 
containing the two halves of the proposed s-SHIP promoter region, plus an 833 nt 
sequence from a lambda genomic clone, which was inserted between these two halves. 
The genomic organization of SHIP1 is shown in Wolf et al. (2000). The starting 
genomic clone contained a 4 kb region from the SacI site near the 3' end of the 7.6 kb 
genomic clone in intron 6, extending through exon 8 and into intron 8. This SacI-SacI 
fragment was cloned into the SacI site of pBluescript SK (pBSK). The GFP gene from 
pEGFP-1 (Invitrogen/Clontech) was excised with NcoI (encompassing the ATG 
translation start site of GFP) and SspI. This was ligated into the NcoI (the putative 
s-SHIP translation start site in exon 7) and EcoRV sites of the pBSK-4kb clone. Next, the 
5' half of the genomic promoter was added in the form of the SacI-SacI 7.6 kb genomic 
sub-clone. This was inserted into the one remaining SacI site at the 5' end of the intron 
6-exon 7-GFP clone in pBSK. This left a gap of 0.9 kb between the two SacI sites in intron 
6 (see Wolf et al, 2000). This region was recovered as a larger BsiWI-EcoRI 2117 nt 
fragment, whose sequence demonstrated the insertion of 833 nucleotides between two 
SacI sites. Therefore, this BsiWI-EcoRI fragment was inserted into the same unique sites 
of the transgenic construct to produce the finished 11.5kb-GFP transgene in pBSK. 

The 6.2kb-GFP transgene construct was prepared from the 11.5kb-GFP transgene 
prior to the insertion of the 833 nt at the intron 6 SacI site. This 11.5kb(Δ833)-GFP 
construct was digested with FbaI and SwaI, removing 5.3 kb from the 5' end of intron 5. 
Re-ligation removed all but 19 intron 5 nt at the 5' end of the 11.5kb-GFP transgene. 
Both 11.5kb-GFP and 6.2kb-GFP transgenes, in pBSK, were cut from the plasmid with 
BssHII and Qiagen-purified from an agarose gel for introduction into the mouse genome. 

Production of transgenic mice 

Founder transgenic mice were prepared in our Transgenic Mouse Facility by 
pronuclear injection of fertilized zygotes from (C57BL/6 female X CBA/J male) F1 mice. 
Mice positive for the transgene were identified by PCR using DNA obtained from tails or 
toes of young animals. The location of the primer set for PCR is shown in FIG. 3: the 
upstream primer (a) is within intron 6 (Pro-up2, 
5'-TACTCCTCAGCAAGAGTAGCTGG-3') (SEQ ID NO:XX), and the downstream primer 
(b) is within the GFP gene (GFP-dn1, 5'-GCTGAACTTGTGGCCGTTTACGT-3') (SEQ 
ID NO:XX); together they produce a 632 nucleotide (nt) product. These primers were used 
for detection of both 6.2kb-GFP and 11.5kb-GFP transgenic mice. Positive chimeric mice 
were bred to C57BL/6 mice, and four founder lines (A, B, C and D) were obtained for the 
11.5kb-GFP mice. Later analyses demonstrated that founder line B was not positive for GFP 
expression, even though the primer pair a and b gave a positive 632 nt product. 
Therefore, line B is not included in further analyses. The other lines were maintained by 
breeding transgene-positive animals with wild-type C57BL/6 mice. For some experiments, 
transgene-positive offspring were generated from positive intra-line breeding. Two 
founder animals were obtained for the 6.2kb-GFP transgene, but one was lost. 
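
Basic properties of the genotyping primer pair described above (length, GC content, and reverse complement) can be checked with a few lines of code. This is a minimal sketch; the helper names are illustrative, and no melting-temperature model is implied.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Sequences as given in the text for primers (a) and (b).
primers = {
    "a (upstream, intron 6)": "TACTCCTCAGCAAGAGTAGCTGG",
    "b (downstream, GFP)": "GCTGAACTTGTGGCCGTTTACGT",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC {100 * gc_fraction(seq):.1f}%, "
          f"reverse complement 5'-{reverse_complement(seq)}-3'")
```

Both primers are 23 nt long with 12 G or C bases each, so their GC contents are matched.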

The transgene copy number in each founder line (except 11.5kb-GFP, line B) was 
determined by semi-quantitative RT-PCR of transgene expression relative to endogenous 
Gab2 expression. Primers for detecting genomic gab2 are: 
E4F, 5'-CTTCTATAGCCTTCCCAAGCC-3' (SEQ ID NO:XX); 
E5R, 5'-CTCGTAGGTCTCACAGGAAG-3' (SEQ ID NO:XX). 

Analysis of embryos 

Preimplantation embryos were harvested at 2.5 and 3.5 dpc from uterine horns of 
pregnant females [see Nagy et al. (2003) for details of these methods]. The morulae and 
blastocysts were washed in RPMI 1640 medium (Gibco) containing 10% fetal bovine 
serum and transferred to PBS (Ca2+ and Mg2+), and GFP-expression or phase images were 
photographed on a Nikon Eclipse TE200 inverted microscope coupled to a Roper 
Scientific 1k x 1k pixel digital camera. Images were captured with MetaMorph software 
and prepared for publication with Photoshop (Adobe). High-resolution z-sections of GFP 
expression within embryos were made with a Leica TCS SP Confocal microscope. 

Several blastocysts were plated onto gelatin-coated tissue-culture wells in DME 
with 10% fetal bovine serum, and photographed three days later. During this period, 
blastocysts hatched from the zona pellucida and attached to the culture plate. The 
attached mass of trophectoderm cells with the non-adherent ICM was photographed for 
GFP and phase with a Nikon Eclipse TE200 microscope. 

RT-PCR analysis of s-SHIP expression in blastocysts 

mRNA was isolated from wild-type 3.5 dpc blastocysts, FDC-P1 cells and the D3 
ES cells using a Dynabeads mRNA DIRECT micro kit (Dynal). Reverse transcription 
used the Sensiscript kit from Qiagen, and the PCR cycling conditions were as follows: 
94°C 1 min, [94°C 15 sec, 68°C 2 min] x 30 cycles, 68°C 5 min, and a 4°C hold. Each 
reaction used the equivalent of 1.5 ng mRNA, based on the concentration before reverse 
transcription. Primer pairs were: 

HPRT-up1, 5'-CCTGCTGGATTACATTAAAGCACTG-3' (SEQ ID NO:XX), 
HPRT-down1, 5'-GTCAAGGGCATATCCAACAACAAAC-3' (SEQ ID NO:XX); 

OCT4-Up1, 5'-GGCGTTCTCTTTGGAAAGGTGTTC-3' (SEQ ID NO:XX), 
OCT4-Down1, 5'-CTCGAACCACATCCTTCTCT-3' (SEQ ID NO:XX); 

SHIP1/s-SHIP pair #3, 
SHIP-E8FW, 5'-TTGCTGCACGAGGGCTCAGAATC-3' (SEQ ID NO:XX), 
SSP883RV, 5'-TCCGATTCTCATGCTCTGGCTTG-3' (SEQ ID NO:XX); 

SHIP1/s-SHIP pair #4, 
SP2109FW, 5'-CAGCCCTGTCTTTGCCACGTTTG-3' (SEQ ID NO:XX), 
SP2637RV, 5'-TCCACTGGATTCATCCCGCTCTG-3' (SEQ ID NO:XX); 

SHIP1/s-SHIP pair #5, 
newfw, 5'-CTTCCTCTTGCAACAGAGAACCC-3' (SEQ ID NO:XX), 
newrv, 5'-ACTCAACGTCCACTTTGAGATGC-3' (SEQ ID NO:XX). 
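
The programmed duration of the RT-PCR cycling conditions stated above can be totaled as a quick check. The sketch below counts only the programmed hold times; ramp times between temperatures and the final 4°C hold are not included, and the variable names are illustrative.

```python
# Each step is (temperature in deg C, duration in seconds), as stated in the text.
initial_denaturation = (94, 60)
cycle_steps = [(94, 15), (68, 120)]   # the two steps repeated in every cycle
cycle_count = 30
final_extension = (68, 300)

seconds_per_cycle = sum(duration for _, duration in cycle_steps)
total_seconds = (
    initial_denaturation[1]
    + cycle_count * seconds_per_cycle
    + final_extension[1]
)

print(f"per cycle: {seconds_per_cycle} s")
print(f"total programmed time: {total_seconds} s ({total_seconds / 60:.1f} min)")
```

The program therefore holds for 4410 s, or 73.5 min, before the 4°C hold.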


EXAMPLE 2: 

Identification and Characterization of the s-SHIP Promoter 

Potential s-SHIP promoter activity was first analyzed in cell lines grown in 
culture. Several cell lines were tested for s-SHIP vs. SHIP1 protein expression, based on 
the known and expected expression pattern of the s-SHIP protein (Lioubin et al, 1994; 
Tu et al, 2001). These results showed expression of the ~104-kDa s-SHIP only in the 
ES cells, whereas the 145-kDa SHIP1 product was expressed exclusively in the maturing 
FD-Fms myeloid cells. Hot SDS-extraction of the ES cells did not change the size of the 
s-SHIP protein, suggesting that this 104-kDa product is not the result of proteolytic 
degradation during extraction (Horn et al, 2001). SHIP proteins were not detectable in 
NIH3T3 fibroblasts, the SNL cells serving as feeders for ES cell growth, or the 293 
human kidney cells. Therefore, NIH3T3 cells and D3 ES cells were selected as negative 
and positive cells, respectively, for analysis of the potential s-SHIP promoter activity. 

A 7.6-kb genomic ship1 region containing the intron-5 region was obtained for 
initial promoter analysis. The entire 7.6-kb region and sub-fragments thereof were cloned 
into a promoter-less GFP (enhanced green-fluorescent protein) expression vector (FIG. 
1). Promoter activity of the intron-5 region was then assayed in cells positive for 
s-SHIP expression (embryonic stem cells, clone D3) vs. cells negative for s-SHIP 
expression (NIH3T3 cells). The expression of GFP in each cell type, assayed by flow 
cytometry, was a measure of the promoter activity within each fragment of the 7.6 kb 
genomic DNA. The results indicated that, whereas empty vectors alone lacked 
significant promoter activity in either cell type, vectors containing intron-5 segments 
exhibited substantial expression in the D3 ES cells but not in the NIH3T3 cells. Segments 
of intron 5 ranging from 0.96 kb to 7.6 kb were active for GFP expression in the ES 
cells; however, the shorter segments appeared most active. Two fragments of 1.9 kb and 
0.96 kb, immediately upstream of exon 6, each exhibited equally high GFP expression. 
The shortest insert fragment contained part of exon 6, but only the 44 nucleotides 
upstream of exon 6 (Tu et al, 2001), and was completely without promoter activity. 
These results strongly suggest that the intron-5 region of genomic ship1 contains 
cell-specific promoter activity, and that segments more distal to exon 6 may have negative 
regulatory activity. 

Based on the ES/NIH3T3 cell-transfection experiments, two new constructs with 
an extended region downstream of the intron-5 genomic area were prepared for in vivo 
analysis of promoter activity in transgenic mice (FIG. 3A). Transgenic mice were 
produced for in vivo examination of the putative s-SHIP promoter/enhancer activity, and 
for determining the overall expression pattern of the transgene, and presumably of the 
s-SHIP protein. The promoter in the longer of the new constructs (the 11.5kb-GFP transgene) 
contained the entire intron 5 from the above 7.6-kb genomic fragment, plus all of exon 6, 
intron 6, and the portion of exon 7 ending at the theoretical ATG start site (Kozak, 1987) 
for s-SHIP protein translation. This start site was fused, in frame, to the ATG for the 
GFP protein. All of intron 6 and part of exon 7 were included in this construct because 1) 
the construct might then more closely resemble the endogenous promoter, 2) splicing 
may be important for efficient expression (Nott et al, 2004), and 3) positive or negative 
regulatory elements for expression may also reside within this sequence. The second, 
shorter, transgenic promoter construct (the 6.2kb-GFP transgene) was similar, but 
contained only 0.96 kb of intron 5 sequence adjacent to exon 6, and also lacked 833 
nucleotides between two SacI sites within intron 6. Thus, if either construct contained 
promoter activity in vivo, transcription would start within intron 5, while intron 6 would 
be spliced out and translation of GFP would begin at the first ATG within an appropriate 
Kozak site. 

Transgenic (Tg) mice were then produced in the Hutchinson Center Transgenic 
Mouse facility, and chimeric animals were screened for each transgene by PCR. Breeding each 
founder to wild-type C57BL/6 mice yielded four lines containing the 11.5kb-GFP 
transgene, and one line with the 6.2kb-GFP transgene. Of the four founder Tg11.5kb-GFP 
mice, one was negative for expression of the transgene (line B), while three were 
positive and each has exhibited the same expression patterns (lines A, C and D). Copy 
numbers of genomic transgenes, measured relative to the endogenous gab2 gene, are 
shown in FIG. 3B. Among the three GFP-expressing 11.5kb-GFP founder lines, empirical 
results indicate that line C exhibits the highest GFP expression levels. Line C 
mice also exhibit lower birth rates, with in utero death apparent at 8.5-9.5 days 
postcoitum (dpc). The single 6.2kb-GFP founder line harbors the most transgene copies, but no 
overt defects in the physical appearance of these mice, their birth rate or development 
have been observed. 

Experiments were then conducted with the adult transgenic 11.5kb-GFP mice to 
examine transgene expression; however, it was difficult initially to find any GFP 
expressed in these mice by flow cytometry of blood and stem cell enriched bone marrow. 
After several negative attempts to find GFP expression, it was reasoned that because 
expression was readily detectable in the initial ES cell experiments, the best test for 
in vivo expression would be the inner cell mass (ICM) of the blastocyst, from which ES 
cells can be derived. Therefore, we looked for GFP expression in 3.5-dpc blastocysts 
derived from mating of Tg males x WT females. Blastocysts derived from one such cross 
produced 9 GFP-positive embryos, indicating that the Tg male was homozygous for the 
transgene. A separate Tg male bred to a WT female produced both positive and negative 
blastocysts. GFP-positive morulae were also obtained from similar crosses, whereas 
blastocysts or morulae from WT parents were negative for GFP. 
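
The inference of homozygosity from a litter of all-positive embryos can be given a rough quantitative footing. The sketch below assumes a single transgene insertion site, independent Mendelian transmission with probability 1/2 from a hemizygous male, and that every transgene-bearing embryo scores as GFP-positive; the function name is illustrative.

```python
from fractions import Fraction


def prob_all_positive(embryo_count, transmission=Fraction(1, 2)):
    """Chance that every embryo inherits the transgene from a hemizygous parent."""
    return transmission ** embryo_count


chance_if_hemizygous = prob_all_positive(9)
print(chance_if_hemizygous, float(chance_if_hemizygous))
```

Under these assumptions a hemizygous male would yield 9 positive embryos out of 9 only about once in 512 such crosses, which is why an all-positive result points to a homozygous male.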

Blastocysts are composed of 2-3 cell types depending on their developmental 
stage. The outer trophectoderm layer of cells surrounds the eccentric inner cell mass 
(ICM), destined to become the embryo proper, and later stage blastocysts also contain 
endodermal cells separating the ICM from the blastocoel cavity (Nagy et al, 2003). To 
obtain a better idea of which cells of the blastocyst express the GFP transgene, transgenic 
3.5-dpc blastocysts were allowed to adhere to a culture dish by three days growth in 
DME with 10% FBS. Under these conditions, the zona pellucida is shed, and the outer 
trophectoderm cells of the blastocyst form an adherent layer while the ICM remains as an 
unspread mass, and each is distinguishable morphologically from the other. The results 
showed that the ICM portion of the blastocyst retained the GFP expression while the 
adherent trophectoderm cells were largely GFP-negative. 

A more detailed picture of GFP expression throughout the intact early 
preimplantation embryos was seen in confocal Z-sections of GFP within transgenic 2.5-dpc 
morulae and 3.5-dpc blastocysts. All cells of the 16 to 32-cell morula were GFP-positive. 
Transition of the morula to the early blastocyst is marked by the formation of the 
blastocoel cavity. A few cells of this early blastocyst structure began to shut off GFP 
expression, and the extent of this GFP shut-off was more evident in the late blastocyst. 
Here, the outer trophectoderm cells had noticeably lower GFP expression, and the 
GFP-positive cells were confined to the ICM. Endodermal cells were not readily apparent. In 
these images, it is helpful to remember that the half-life of the GFP fluorescence is 
greater than 24 hr (Tech. Brochure, BD Biosciences Clontech), and therefore cells that 
have stopped expressing GFP will retain some GFP protein and fluorescence for several 
days. Twenty-four hours separates the morula from the blastocyst stages; therefore, 
transgene shut-off early during this time would result in lower but not complete lack of 
GFP fluorescence late in this time span. The 11.5-kb transgene s-SHIP promoter 
contains the information for both cell-specific positive expression in the morula and ICM of 
the blastocyst and cell-specific shut-off in trophectoderm cells. 

Preimplantation embryos from the Tg6.2kb-GFP mice were analyzed next. The 
transgene in these mice contained only the proximal 0.96-kb region upstream of exon 6, 
which was necessary for GFP expression in the ES cells. It also lacked 833 nucleotides 
between two SacI sites of the intron-6 region. GFP expression in the 3.5-dpc blastocyst of 
the 6.2kb-GFP line was analyzed. Both qualitative and quantitative features of GFP 
expression in the Tg6.2kb-GFP blastocysts differed from those in the Tg11.5kb-GFP 
mice. First, GFP expression in the Tg6.2kb-GFP blastocysts was noticeably stronger (at 
least 5-fold) than that in the Tg11.5kb-GFP blastocysts, as measured by exposure times 
for obtaining equivalent GFP images in the Nikon digital microscope. Second, and more 
noticeable, was the lack of GFP shut-off in the trophectoderm cells of the blastocyst. No 
clear demarcation in GFP expression was evident between ICM and trophectoderm, as was 
seen in the Tg11.5kb-GFP blastocysts. 
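
As a rough aid to reading the blastocyst images discussed above, the persistence of GFP fluorescence after transgene shut-off can be estimated. The sketch below assumes simple first-order decay with no new GFP synthesis and no dilution by cell division, and it uses 24 hr as the half-life even though the text states only that the half-life is greater than 24 hr, so the values are lower bounds; the function name is illustrative.

```python
def fraction_remaining(hours_since_shutoff, half_life_hours=24.0):
    """Fraction of the initial GFP fluorescence left after first-order decay."""
    return 0.5 ** (hours_since_shutoff / half_life_hours)


for hours in (0, 12, 24, 48):
    print(f"{hours:>2} hr after shut-off: {fraction_remaining(hours):.2f} of initial signal")
```

With these assumptions, a cell that shut off the transgene at the morula stage would still carry at least half of its GFP signal at the blastocyst stage 24 hr later, consistent with the lowered but not absent trophectoderm fluorescence described for the 11.5kb-GFP embryos.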

Blastocysts from the Tg6.2kb-GFP mice were also allowed to adhere to culture 
plates and GFP expression was examined. Adherent blastocysts from Tg11.5kb-GFP 
mice were examined simultaneously. Adherent Tg6.2kb-GFP blastocysts expressed GFP 
in both ICM and trophectoderm cells in a frequently haphazard pattern. The Tg11.5kb-GFP 
adherent blastocysts expressed GFP only in the ICM, as observed previously. A 
comparison of all embryos examined revealed that increased GFP expression was 
apparent within the adherent Tg6.2kb-GFP blastocysts relative to the adherent Tg11.5kb-GFP 
blastocysts. These results were consistent with the promoter analyses performed in 
the ES cells (FIG. 1), and suggested that the lack of GFP shut-off by the 6.2kb-GFP 
transgene was due to negative regulatory information found in either one or both regions 
of the 11.5kb-GFP construct missing from the 6.2kb-GFP transgene. 

The data from Tu et al. (2001) and that presented herein demonstrated exclusive 
s-SHIP (rather than SHIP1) expression in ES cells. Yet, even though ES cells are derived 
from the ICM of the blastocyst and the intron 5 s-SHIP promoter functioned well in the 
ICM, it was still not certain whether the ICM actually expressed s-SHIP in vivo. 
Consequently, s-SHIP mRNA expression was then analyzed by RT-PCR and compared to 
that of the universally expressed HPRT and the ES cell and ICM-specific Oct4 
transcription factor. RNA from blastocysts, FDC-P1 myeloid progenitor cells, and D3 ES 
cells was positive for HPRT as expected, and only the blastocysts and ES cells were 
positive for Oct4. Initially, s-SHIP-specific primers similar to those described by Tu 
et al. (2001) were used to test for s-SHIP expression; however, poor results were obtained. 
The forward primer in this set was moved 3'-ward into the region identical to SHIP1, but 
weak detection was still obtained. s-SHIP was therefore detected by "subtraction" using 
primers detecting both s-SHIP and SHIP1 products vs. primers detecting only the SHIP1 
product. These primers clearly demonstrated the presence of full-length SHIP1 only in 
the FDC-P1 cells, and s-SHIP in both blastocysts and ES cells. The weak detectability of 
s-SHIP may be due to poor hybridization of the primers, degradation of the 5' s-SHIP 
mRNA ends, or possibly an additional shorter transcription product from the ship1 gene. 
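
The "subtraction" reasoning described above can be written out as a small decision rule. This is an illustrative sketch only: the function name and the true/false band calls are assumptions that restate the outcome reported in the text, not raw data.

```python
def interpret_ship_rt_pcr(common_pair_positive, ship1_only_pair_positive):
    """Interpret RT-PCR results from a primer pair that detects both SHIP1 and
    s-SHIP and a primer pair that detects only SHIP1."""
    if not common_pair_positive:
        return "no SHIP transcript detected"
    if ship1_only_pair_positive:
        return "SHIP1 present (s-SHIP not resolved by subtraction)"
    return "s-SHIP inferred"


# Band calls implied by the reported outcome for each RNA source.
samples = {
    "FDC-P1 cells": (True, True),
    "3.5-dpc blastocysts": (True, False),
    "D3 ES cells": (True, False),
}

for source, (common, ship1_only) in samples.items():
    print(f"{source}: {interpret_ship_rt_pcr(common, ship1_only)}")
```

The rule makes explicit that s-SHIP is called only when the shared region amplifies and the SHIP1-specific region does not.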

Examination of the minimal 0.96-kb promoter proximal to exon 6 by 
MatInspector indicated several transcription-factor binding sites potentially active in ES 
cells and the blastocyst ICM. FIG. 4 shows the first 600 nucleotides of this region 
upstream of exon 6, with potential transcription factor binding sites and motifs for 
transcriptional regulation marked. A transcription initiator sequence (Butler and 
Kadonaga, 2002) straddles the 5' end of the 44 nt SSR, suggesting a transcriptional start 
site. Prominent motifs include paired GATA or Lmo2 binding sites, two overlapping p53 
and Oct-binding sites, and a single extended FOX-factor binding region. The 
Oct-binding motif is present in similar regions of both the murine and human s-SHIP 
promoter, suggesting such a factor could be important for ES and ICM expression. The 
POU factor Oct4 is expressed in ES cells and is part of an enhancer for ES cell-specific 
expression of target genes (Dailey et al, 1994). Therefore, the Oct site could be part of a 
similar ES cell enhancer region. 

The transgene expression in preimplantation embryos raises a question about 
possible progenitor transgene expression in the oocytes or sperm of the adult, which then 
give rise to the fertilized embryo. The transcription factor Oct4 is expressed in adult and 
embryonic germ cells, as well as in the blastocyst ICM and in ES cells (Pesce et al, 1998). 
The possibility that the 11.5kb-GFP transgene could also be germ cell specific is even 
more likely given the prominent Oct4 binding motif within the 0.96 kb minimal promoter 
upstream of exon 6 (see FIG. 4). Therefore, ovaries and testes from 7-8 week old adult 
Tg11.5kb-GFP mice were harvested, and frozen sections were stained with Alexa 594-labeled 
phalloidin for visualizing tissue structure through polymerized actin staining and 
examined for endogenous GFP expression. The results of this experiment demonstrated that 
neither the developing sperm of the testis nor the developing oocytes of the ovarian follicles 
expressed GFP. Only blood vessels of the testes and ovaries exhibited specific GFP 
expression. Therefore, unlike the Oct4 transcription factor, the 11.5kb-GFP transgene is 
not a maternally activated gene and must be transcriptionally activated sometime after the 
germ cells leave the ovary/testis and before the 2.5-dpc morula stage of development. 

All of the compositions and/or methods and/or apparatus disclosed and claimed 
herein can be made and executed without undue experimentation in light of the present 
disclosure. While the compositions and methods of this invention have been described in 
terms of preferred embodiments, it will be apparent to those of skill in the art that 
variations may be applied to the compositions and/or methods and/or apparatus and in the 
steps or in the sequence of steps of the method described herein without departing from 
the concept, spirit and scope of the invention. More specifically, it will be apparent that 
certain agents that are both chemically and physiologically related may be substituted for 
the agents described herein while the same or similar results would be achieved. All such 
similar substitutes and modifications apparent to those skilled in the art are deemed to be 
within the spirit, scope and concept of the invention as defined by the appended claims. 

WHAT IS CLAIMED IS: 



1. An isolated polynucleotide comprising an s-ship promoter capable of promoting 
transcription. 

2. The isolated polynucleotide of claim 1, wherein the promoter comprises at least 
20 contiguous nucleotides from SEQ ID NO:1. 

3. The isolated polynucleotide of claim 2, wherein the promoter comprises at least 
50 nucleotides from SEQ ID NO:1. 

4. The isolated polynucleotide of claim 3, wherein the promoter comprises at least 
100 nucleotides from SEQ ID NO:1. 

5. The isolated polynucleotide of claim 4, wherein the promoter comprises at least 
500 nucleotides from SEQ ID NO:1. 

6. The isolated polynucleotide of claim 5, wherein the promoter comprises at least 
1000 nucleotides from SEQ ID NO:1. 

7. The isolated polynucleotide of claim 6, wherein the promoter comprises at least 
5000 nucleotides from SEQ ID NO:1. 

8. The isolated polynucleotide of claim 7, wherein the promoter comprises about 6.3 
kilobases from SEQ ID NO:1. 

9. The isolated polynucleotide of claim 8, wherein the promoter comprises about 7.6 
kilobases from SEQ ID NO:1. 

10. The isolated polynucleotide of claim 2, comprising SEQ ID NO:1. 

11. The isolated polynucleotide of claim 1, wherein the promoter is capable of 
promoting tissue-specific transcription. 

12. The isolated polynucleotide of claim 1, wherein the promoter region is operably 
connected to a heterologous nucleic acid. 

13. A nucleic acid comprising a promoter operably attached to a nucleic acid 
sequence from an s-ship gene or a portion thereof and a marker sequence, wherein the 
s-ship gene is disrupted by the marker sequence. 

14. The nucleic acid of claim 13, wherein the promoter is an s-ship promoter. 

15. The nucleic acid of claim 13, wherein the promoter is constitutive. 

16. The nucleic acid of claim 13, wherein the promoter is inducible or conditional. 

17. An expression cassette comprising an s-ship promoter operably connected to a 
nucleic acid segment. 

18. The expression cassette of claim 17, wherein the nucleic acid segment is 
heterologous. 

5 19. The expression cassette of claim 18, wherein the nucleic acid segment is a 
reporter gene. 

20. The expression cassette of claim 19, wherein the reporter gene encodes a gene 
product that is colorimetric, enzymatic, or fluorescent. 

21. The expression cassette of claim 18, wherein the nucleic acid segment encodes a 
10 therapeutic or diagnostic gene product. 

22. The expression cassette of claim 2 1 , wherein the therapeutic or diagnostic gene 
product is a polypeptide. 

23. The expression cassette of claim 21, wherein the therapeutic or diagnostic gene 
product is an RNA molecule. 

15 24. The expression cassette of claim 23, wherein the RNA molecule is an siRNA or 
miRNA molecule. 

25. The expression cassette of claim 21, wherein the nucleic acid segment encodes a 
therapeutic gene product. 

26. The expression cassette of claim 25, wherein the therapeutic gene product is 

20 selected from the group consisting of a tumor suppressor, a cytokine, a cytokine receptor, 
a differentiation-inducer, growth factor, and a growth factor receptor. 

27. A vector comprising an s-ship promoter. 

28. The vector of claim 1 , wherein the s-ship promoter is operably attached to a 
nucleic acid segment. 

25 29. The vector of claim 28, wherein the nucleic acid segment is all or part of an s- 
shipl coding sequence. 

30. The vector of claim 28, wherein the nucleic acid segment is heterologous. 

31. The vector of claim 27, wherein the vector is a plasmid, YAC, BAC, or virus.

32. The vector of claim 27, comprised in a pharmaceutically acceptable formulation.

33. A host cell comprising an s-ship promoter operably attached to a heterologous
nucleic acid segment.

34. The host cell of claim 33, wherein the host cell is eukaryotic.

35. The host cell of claim 34, wherein the host cell is an embryonic cell. 

36. The host cell of claim 35, wherein the embryonic cell is a blastocyst cell. 

37. The host cell of claim 34, wherein the host cell is a hematopoietic cell. 

38. The host cell of claim 34, wherein the host cell is a stem or progenitor cell.

39. The host cell of claim 38, wherein the stem or progenitor cell is from tissue 
selected from a group consisting of skin, a hair follicle, cornea, embryo, gonads, 
mammary gland, pancreas, and smooth muscle. 

40. A recombinant host cell in which one or both s-ship genes is disrupted by a marker
sequence.

41. A transgenic animal comprising an s-ship promoter region operably attached to a
heterologous nucleic acid segment. 

42. The transgenic animal of claim 41, which is a mammal. 

43. A mammal having cells comprising an s-ship transgenic sequence. 

44. The mammal of claim 43, wherein the s-ship transgenic sequence comprises an s-
shipl coding sequence flanked by loxP sequences.

45. The mammal of claim 44, further comprising a heterologous nucleic acid 
sequence encoding a Cre recombinase. 

46. The mammal of claim 45, wherein the nucleic acid sequence encoding the Cre 
recombinase is under the control of an inducible or conditional promoter.

47. A method for expressing a recombinant nucleic acid in a stem or progenitor cell 
comprising: 

a) transfecting the cell with an expression cassette comprising an s-ship
promoter operably attached to the recombinant nucleic acid, wherein the
nucleic acid is transcribed.

48. A method of screening for a candidate substance that regulates activity of the s- 
shipl promoter comprising a step selected from the group consisting of: 

(a) contacting a nucleic acid comprising an s-ship promoter with an s-ship
promoter binding protein and the candidate substance under conditions
that allow binding between the protein and the promoter and determining
whether the candidate compound modulates the binding between the
protein and the promoter; and

(b) contacting the candidate substance with a cell comprising the s-ship
promoter operably attached to a reporter gene coding for an expression
product and assaying for expression of the reporter gene expression
product.

49. A method for identifying stem cells in a population of cells comprising: 

(a) administering to cells in the population a nucleic acid comprising an s-ship 
promoter operably attached to a reporter gene.

50. The method of claim 49, wherein the cells are in an organ.

51. The method of claim 49, wherein the cells are in an animal.

52. The method of claim 49, further comprising sorting cells based on expression of 
the reporter gene. 

53. A method for screening for a modulator of cell function comprising: 

a) transfecting a stem or hematopoietic cell with an expression cassette
comprising an s-ship promoter operably attached to a nucleic acid
encoding a candidate modulator; and,

b) assaying the cell for a cell function, wherein a difference in cell function
in the cell as compared to a cell in the absence of the candidate modulator
is indicative of a modulator.

54. The method of claim 53, wherein the modulator is a candidate therapeutic agent 
for the treatment of a blood-related disease or condition. 

55. A method of treating a patient with a blood-related disease or condition 
comprising: 

a) transfecting a cell with an expression cassette comprising an s-ship
promoter region operably attached to a therapeutic nucleic acid; and,
b) administering the cell to the patient. 

56. The method of claim 55, wherein the cell is a bone marrow cell. 

57. The method of claim 55, wherein the cell is autologous.

58. The method of claim 55, wherein the cell is allogeneic.

59. The method of claim 55, wherein the blood-related disease or condition is a
blood-related cancer. 

60. The method of claim 59, wherein the blood-related cancer is leukemia, 
lymphoma, or myeloma. 

61. The method of claim 55, wherein the blood-related condition is anemia.

62. The method of claim 55, wherein the blood-related condition can be treated with 
stem cell replacement therapy.

ABSTRACT



The present invention concerns s-ship promoter compositions and methods using
the promoter. It includes polynucleotides, vectors, host cells, and transgenic animals
including an s-ship promoter controlling the expression of a heterologous nucleic acid.
Methods of the invention concern expressing a heterologous nucleic acid in a
tissue-specific, developmental-specific, or temporally controlled manner. Other methods
include screening methods and therapeutic methods.






FIG. 2

FIG. 3A

[Drawing labels: Transcription start; 11.5kb-GFP transgene; Ex6; Translation start; polyA; 6.2kb-GFP transgene]

FIG. 3B

[Drawing label: Transgene copy numbers]
600 AGAAGTGATCTGGGCTGGACAGACTAGCTGAACTGGCCAGCTCTGGGTTC

550 ATCAAGAAACCCTACCTCCATAACATAAAGTGTGATGGAGAAAGGCACCT

Areb6

500 AATGTCAACCTCAAACCCCTACCTGCATGTGCACACACATACATCCACAC

450 CACACACACACACACACACACACACACACACCACACACACACACACACAA

Nkx3.1    Fox factor binding sites

400 ATAAATAAGTAAATAAATAAAATATTTAGCTCTCCAGACCAAATCTTGGT

Oct4    Pax8

350 GAAACCCATGCATTTGCATTTGTGTGTGTCCTACAAACACTGAAGGTTAA

Cdx2

300 GAAGCATGCTCCTTAGTAATTTTATAGCAGTTTGCGTTTCCAGATTGAAA

Gata / Lmo2

250 ACAGATTCTATAGGCTACACAGTGCTAAATGGATTATGCTCAGATACAGA

Smad3/4

200 TTGAAAAGGATACAGATTGAAAAGGGTCGGGGTCTGGGCCAGGATGACGG

p53    Stat1/5

150 GCCAACTATCTTTGCCCGGGCTTGTCCTTCAGGGAAGGGTTACAGGATTC

Gata / Lmo2

100 ACCACTGGGGTGTGGCCTATCTGCTGTTAGGACCTGAATTGCCTGGAGTG

Initiator    stem-SHIP region

50 TTTCTAGTTCCCACTAGTTGTTGAACTTTACCTTGAACCTCTGCTCCCAG



FIG. 4 



SEQUENCE LISTING 



<110> ROHRSCHNEIDER, LARRY R. 

<120> METHODS AND COMPOSITIONS INVOLVING S-SHIP PROMOTER 
REGIONS 

<130> FHCC:016USP1 

<140> UNKNOWN 
<141> 2004-03-18 

<160> 15 

<170> PatentIn Ver. 2.1

<210> 1 

<211> 100140 

<212> DNA 

<213> Mus musculus 

<220> 

<221> modified_base
<222> (27350)..(78168)
<223> N = A, C, G OR T/U 



<400> 1 

ggcaatttct gagaggcaac aggcggcagg tctcagccta gagagggccc tgaactactt 60

tgctggagtg tccgtcctgg gagtggctgc tgacccagtc caggagaccc atgcctgcca 120

tggtccctgg gtggaaccat ggcaacatca cccgctccaa ggcagaggag ctactttcca 180

gagccggcaa ggacgggagc ttccttgtgc gtgccagcga gtccatcccc cgggcctacg 240

cactctgcgt gctgtgagta cccgtctcct cccaactgtc agatccaggg accactgagg 300

tgtggatcca aagggggaac ccctgtaatg ggagtttgag ttaggtttat gtcataggat 360

ggtgggacgt gactggcact tcgttgccct gtggggaggg gagaaggggg ggcagcatct 420

gaggcccact ttggaccttg gcgttcgagt tcagggagcc tgtgtcatga caggcttgtg 480

tggtgctagg gctctctgag tgactctggg cctcccctat actgcagccc tccatgacct 540

gtggtgccag gggtctccgt gagttcctgt ggcaggagca gggacagagt caggaagaaa 600

ctcaggcctc tctggtggag ggtgtattgg aatgcatttt ggtcagctca agcgtcagtc 660

agggactcag tgaagggcaa ccttggcaaa agggtcccct cctccaccct gctagtatgt 720

ggtcttagag ctaattctat ttggggagct gagtccgggg tgcatttaat caggatagga 780

ttcctcgtgt acacatttta ctctaggctg caaggacaca ggaagccaca gagctgctct 840

ctgagtaggt ctctgtccct ccttgcactc agctatgtcc ctccctatcc aggtctctgc 900

ccccgttggt acccccccac agcccagtgt gaagatgttg ctgaatatgc tggttatcct 960

aacaacagag gaaacagcca cttgctgaag gttccttttt aacactccgt ccgtctagcg 1020

tcttctgagg aaggccgcca cccctatagt cctgtggtca gaccctgtcc aggcttcagg 1080

ctggagcagg gcaggaacac tgtcaggaag ggtgtgccta cttgaacagc acaggtacca 1140

tctgatagat tgtccctgga ctgagagaaa gatctttcag agcagctagc tgcccccccc 1200

ccaatcttca tgcaggaggg aagtgggtga ctgaccacag actgttctga gctctgactc 1260

atgttgggac ctgctgtgct aggcatttgt tggctagttt catcttcaag agagccagga 1320

gtgtgggttc tatgaacacc tagacctgcg attaaggacc atgagggctg gacaggttac 1380

aaacatcgag gataaggtcg gtcatttgct cccagagacc actagcttgt gctgcctacc 1440

tcctaccacc atcaggctag gcatgccagt ggacttgaat ggaagtaaga ggagctagag 1500

tgttcacaga gttgggggca ggggcactag aagcccgtac aggctcacgg ctggctgagc 1560

atcttgtgct ggtggagtta gcagccagct tcctgcacac ccaccagcat atttcaggag 1620

aagcactgca ggtggcctgg acctccccaa agcactgtct cttggggaat ctaccagtga 1680

gaggccctga aaggaaagag ggcaggaagg tacttttcag tgtgtcacaa gctcagcctg 1740

actctactaa cccagttata tttttttctg tcccagtggg atttgggcca agggcaattc 1800

tgttgggagt tggctatggg ctgagagact gttgtatgca ttggttacaa atgcacacca 1860 

cggtatgggt tctcttgtgc taaactggca gcatctggaa ggctggagtc agagcagaag 1920

gagcctgagc caagaccaag ggatcttgga ggaccttcgg gcaacacaga ccttgccttt 1980 

tttcttcatc tgactcccct gcctcagctg tcttaagtcc aagcaaagca aagatgacac 2040 

ggagattttc aaactaaagg aatgactacc acaacctcag tgttctataa tgaccccccc 2100 

cacacacaca cacaccctaa aaatgtagga aaggcaggac agtggtggtg cacaccttta 2160 

atcccagcac tcgggaggca gaggcaggca gatttctgag ttcgaggcca gcctgatcta 2220 

caaagtgagc tccaggacag ccagggctac acagagaaac cctgtctcga aaaaaaacaa 2280 

aaaaacaaaa aatgtaggac aggcatactt tttaatttga aaaattgtta gaagcctgcc 2340 

ttttcatgca aagagactta acttcctgaa aaaaacaaac tttagatcct tactttctcc 2400 

tgttctctgt ggatgatgga acctccccca cctcatcgct gccccctcgc catctgccct 2460

ggccagagtc caggctcctg cccaagaaga taagtcagca gcttgtagga cagcaacaag 2520 

gtcgaggtca gagatggggc tgtgaaggag agatggggca tggggtacat gtgggataca 2580 

gggcagctga gttctctttg gtttcaggag tgatagattt ggcacgtgtg gtgtcttgct 2640 

ggacatcagc cagtctgtgt gggctctggg gcaggcagct gtgggtcagg tgccgctggg 2700 

actccatcca tgttccttct tgtgaccgaa gggacaccaa caggggccca gtgcatgtgt 2760 

tgggtttgtt tggccctctg ggaagaatgc agcattgtga ggagaacctt cctgctctga 2820 

gatttgacta gacatgacat gggagaggga ggtaaatgct taaagacaag gttgcaattc 2880 

agttccaccc acgtgacacg agtgcacatt caccccacac attgaccttt gtttccttca 2940 

gagagatcaa tcctgtcact tatcaccaag caagaaggct tttctttctt ctcagtcatc 3000 

attcagagag ccttggttta ggggagtctg gagttacact ggggccccgg agcactggcc 3060 

tggggagacc ttgctagtat gactggagtt ctgttctacc ttccttgaaa gggaatgtgt 3120

gccttttgag tggggcctgt atcaccttca cttgagtaga gcctgtgtca ccttcactca 3180 

tttgtactca gttcctccgt gatgtcagct cccttcccag gagcctgtgc accctgttgt 3240 

ggggtattag ccaggtggat ggagacctat taagagtctc atgagcaggg acagcgcagc 3300 

tacaccatgt gttgcagaga agacaatgct tctgagggta gctcagcaga tgccgggttt 3360

gtgggtcatc ccagtagctt cttattgctg aagctacata gcaagaattt gaatgatgac 3420 

ccagcacttg gaaacaactt gctctttcta aaagagatga caaggccaga ttcaacttgg 3480 

tcaagatgac tgttgtctat gtgaatggca tttcccctaa actactctgg agtgcttcct 3540 

cccttgcagg gagaatatgg ttgcctctgg tccagccccg atgaaggatt ccctaagcaa 3600

gtggtcttcc atagtgcacc caggcctggt ggtggggtaa gctctgtccc agggatagaa 3660 

tgccaatagc cttggtagct ctggcagtgc agaaagaaag gagaaaaggc atgggacatt 3720 

tacaaccaaa actgccctca gagaaggcat tctagtcttt tgaaagaaac ggtgtgacca 3780 

gacactgggt gtgataagcc tgccagggga gataaaaaca ggccctgctt ttaggactac 3840 

tggagagcag ggtgaagatc acacacttat actgtctcac acttgttctt tggtaggaag 3900

agaactgcag agaaggagtt ggtgagggtg aaaatgccta gcagagggag tcagggcatg 3960

aggtcatctc ccctccccat cctccattga aaatgtctat aaggtttcca cagcatgata 4020 

atggcttttg tgcaaacaca gtgtggcact gttttccata tctggcatca aaggcaaatg 4080

agggaaatac ctgcataggc agaacccaga gctgaaggcc tatgggctcc tgagaaacat 4140

gagaaaaggc ctttgtttga gaaggatgct cttgaagact cattgtgtcc aagaccagga 4200 

ggaagggctg aacccaggag ggcctattta aggcctattg tatcatttat gtaagtggca 4260 

gagtgactat gtttctgccc atacccagta ccctggagct gtcttcccag gtacagggta 4320 

ggcactggct taggcgcata ggttaaattc acctacaacg caaggccatt agcacttctt 4380 

aacgatccca tctctctgcc tgccacagag atggaacaag aactgctatc gttacccaat 4440

tcatgccagg ccacgtagtc ccaaggagca agactctgga gctgcctgga atctcttgtg 4500 

tcatgtcagc aagacaccta gaaccccagc tccaaagaga ggctggccag accagttcac 4560 

tggctttaca gtgcctcagc tgaggttaag gtacccactg ttaagtcacc catccacatt 4620

ctagtttgtg gttcaatggt ccttagtaat ggtcacagag ttacgcaact gacaccacca 4680 

tctaatttca gagagttctc attattccca gaagaactgc cccccccccc cacacacaca 4740 

catgtgagca gctgcttgct tttcttctga cctctggaaa gagccagtct actttctgcc 4800 

tttaaggatt tgcctgactc ttgcattttg tgtatgtgaa attatatggc gtgcagcttt 4860

tgacatctgg atctgtccac ttagcacaat gccctatgct cattgaattg tagcaggcat 4920

ccaatggttt gataacccac tgtgtggaca caccacattc cccggttgag tggcgttttg 4980 

cactgttctt actttttgac tgttttgaac aacagttgct gtgaacattc atcaaccagg 5040 

ctctgtgtgg atgtctgttt gcaggagtct tggggagtga gaggcagggc tgggtcatgt 5100 

ggtgattttt atgtttagcg ttttgaggaa gtactgaact cttgtgctca gctagagctc 5160 
taaccagaca ttgtgagggt ccagtttctc ttctagttac tctctctctc tctctctctc 5220

tctctctctc tctctctctc tctgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 5280 

tgtgtgtgtg catgtgtaca tttgcacact gtgtgtgtgt cagccaaaag acaacttcag 5340

gagttagttc gctccttcta ccatgtgggt cccagggact gaactcagat agtcaggctt 5400

aatgacaagt acccttaccc tctgagctac ctcactggcc ctaccttttt catttcttta 5460 

tttaagacag aggctctcaa tgccctggaa cttacgtagg ttaggtgggc tgatgttaga 5520

agcaccaagc ccagttagtg gtggtcggtt gcttgttaat gtagtctggg catcaaactc 5580 

aggtgcttat gctttcaagg gaagcacttt atcagctctt acctccccag cccctcactt 5640 

gtttgtgtgt ttgtgagcta tggtctcaca tagcctaagc cagcctcaaa ctccctattg 5700 

tagtcaaaat tggctttgaa ctcttgatca tcctgcctcc actttctaaa tgtattcacc 5760 

acaatatctg gctgtctttt tattctaccc agcttagggg ctgcaaagca gcacagcatc 5820

taactgtggt tttagtttgt atccctaatg ttaataatgc taatgggtca tttgtatgtc 5880 

ttacttggag aactgtctgt atagtctttg cacattgaaa tagctatatc atttcaatga 5940 

taaaatagga agacaggagt aatgtggata ttccatagcc taacttgaaa cctctaggtg 6000 

ctttgcatat gtaaaaatag agctggccat tagtttttag agctgaaagc aaccagtgat 6060 

atctcagaag cagaagaccc atctgtggag aactttccat gacaggagag gagagctgtg 6120 

acagtgtcac ttccgggact tcctggaggc ctctgggaga cacagagctg tcatgtgggg 6180 

cctccacaga ggaagtgctc aaagtgactg aggcaggaag aggacttcaa tgacgcaaga 6240

gtgtctggtg gctgctgagc cacaggggtt gacagcctgg aagcctgggg agggaggagc 6300 

ctgggtacag acagcaagaa atgcttagag gactgggtat gaagatgaag tctaggagag 6360 

aggctggccc ccaccacctt ctgacctgag ctccaactta gtaaagagat gccgaaggga 6420 

tgaagtcctc tgatcgctaa aatgctagct gttctatggg aggaaacacc atgttggtgg 6480 

gcacttgccc tttgagagag agggtgacgg aggtggtgag ttcaatctcc acagcccaca 6540

tggtgcatgt ttgtaagccc agttctagga aggtggagac cggagggtcc ctagggttca 6600 

tgggccagcc agactagctg aattagtgag actcagacca tgtctcaaaa caaacaaaca 6660 

ccaaaacaac aacaacaaac cctagagggt atctgaagga ggacatcatg attgtcctct 6720 

gatctctact cacacattca tgtatgcaca catgtgccct tacacacata catgggggca 6780 

tactgttaac atgtgccagc acccatcaat ttgggtttac tttgcaattg aagtaacact 6840 

gtgaagggct ggccaaagga acctctgaaa aaagagagtg cccacgtggc cctgcagctg 6900 

gaataggcag tgtagaggtg gacagacctg gtatgaaaca gcaaagtctt ctttcaggtt 6960 

aacagtagta ttggctgtgc ggtggtacac acaggtgact gcacactccg aaagctgagc 7020 

aggaagattg ctgaaagttt cagaccagct tcccaggcta catagtgagg ccctatctca 7080 

acttaaccct cacagtgagt taagcataaa gtaaaaagtg taaaataaac aaaaattgtc 7140 

acccaatcac taagagacta gtgcaagcca tgtgaatttg ttgatgtatc ttctcatacc 7200 

catttctatc acataatata catacacaca catacacaca catacacact cacatatatg 7260 

tacacaccac ctttcatttg tggtctgggg tttccaacaa catcaatttt tcctgcctag 7320 

ttagaatttc ttccttaaaa tcactcttgc tggaagccta ggcttctctt ccagtatgtt 7380 

tatgctttat ggaatgattc tttgattgac atttacatta ccagcaatct tgctgtaagt 7440 

gtggcagcag caaaaaaggt cttccagcag aattgctaca tatgttactg gttacaaatc 7500 

cccactgcag ttagagcacc agcaggaatc gtcttcaaag actggacaca aaatggcaac 7560 

ttgctttcca gaaaaattga cttaatgtat gcttctggca agcagcatga gaaaatgccc 7620 

acttccttcc aatatcctat tctacggaat gttactttta agaaaattgg agtcataaca 7680 

tgaaaggtca tctgtgatca ttttgttttg cttttgattg ctgatcctgt ttgagaggtt 7740 

tgctggccat tttcttttgt ggatggtcct ctattctctt agcttctttt cttcaattaa 7800 

gactgtgtgt agcttgggaa gaagggttgt gataagtccc aggctaccac ggactgtata 7860 

atgaggtctt atcttaatag agggtggagt ggagctggag atatggccca gtggttaagt 7920

gtttgcttag tgtgtgtgag gtcctaggtt caatccccag taccaaggaa tgctgcaagc 7980 

tttgccatct gcctatgtga gttctgtcat gtttgtgaca aatatatctt cacgatggtc 8040 

tgcctttaag gtttagattc tctgacatgg gagggtcact gttttctact gggacttctt 8100 

cctgggattt ctaacacaga gactccttct gcacacagaa atccactcag aaaggttaag 8160 

actgggtctt tgtagctcat ccacttgggt taagggagga acctccagca gtttatcgaa 8220

ttacactggg ccccagcctt gtcatctgaa aaccaagtat aagaggatct atgtggtaga 8280 

attgttgcat gtgttgatgt gtgtatctgt atcatgtatc atgatactaa caaatgtagc 8340

caccgtggct attattttta tatgttttat tgttcactca atttagcttc aatagctaaa 8400 

tcttcttgga cttaatagat gacaagtgat gaagatttat gtgtattttt ctcaagtaac 8460 

catccttccc caacctttta ttaaatgact tatgcttttt ccctccatga ttcctttgtc 8520 

taccatatgc taagctcttt ctgggttgtt tatctgtcca agaataccca actacaccac 8580 
tacctttaca aagacataag cataattttt caaaatgttt tggggttatt atctcttata 8640

ttcaagtgtt ttcccaggta aaccagatag taatattttt caacttggac aaaacttact 8700 

gttatgactt tcactgggat ggcaggaaat cctaatgtaa gctgtgagct tgacagtctt 8760 

attgtgctcc ctctctgatt gtgtaatata tttcacacag accctgagtg tttcttgtct 8820 

gggtctctgc tgatgacttg tatgtttttg ttattgttgt gaatggcaca tttctcccgc 8880 

agttactggt tcatgataag cggttgcctg catagaattt tatgctttca gctttcattt 8940 

tgtcccactc tttggccact cctcattcat ctttcagctg cttctcctgt cttttctttt 9000 

taaagagatc ttgttatctg tggtgtgtgt gataccatcc atgagcacag atgtgtgcgc 9060 

acacatgttt gtgtttgcct gcacatgggc caggaggtga gatgaggaca cctgaaatca 9120 

tgatccatca ctctctgcct tgttcacttg agacaatgct tctctctaag catggggctt 9180 

gctgattttg gctatgctga ctgaccaggt ccagcaagcc tcctgtctct gctacccacc 9240 

caccccacct cagcactgga tttatagatg tgtgtgacca cagctcagtt ttcttaaatt 9300 

ttcagacagg acctcactat gtatccttgg ctggtttaga aatcacaaag atgggcctac 9360 

ttcccaagat ccagtattca aggcacacac caccacacag actctctgtc ctgctttttc 9420 

acgtgggctc cggggatcta aactcaggtc tttatgctca cacagtgagc actcttactc 9480 

actgaacgtt ctacccagcc caacttttct gaataggcct tcatatcatc tgcaaaccag 9540 

gatcatagtg tttgcttcct aatgcttaca accactttct caaatgccat agcctattat 9600 

attggccagt atttctaaag tggtagagtg gcaactgttg gaaaaaggaa acgtctgcct 9660 

taaggagagc ttcacatacc ctgccaccgg gttgcctttt ggtcagggcg ttttattaca 9720 

agaatggaat tgggacatga gggttctggg aaggcagata cctgtcaggg cctccccatg 9780 

ctgttttgaa acaatgggtt tatttaagat agagatagct cctttatatc ctttgttggg 9840 

ttggggccct cttttgagtg gtgataaggt ctggtgattc tttttctttt tctggcaact 9900 

ggggactgag ggaaacagct tcttttagaa tcaaaagcca ttttgcccaa catatcttat 9960 

gaaagagaga cagacaggca gacaggcaga aagaaagaag aaaaagaaaa aggaaaacac 10020 

caagatgaat ggggatgaaa ggacccacca aaccaacaag acacatgtag tattgggcca 10080 

gtggtaaact cagacagaca gacagacaga gggctccctt gtgagggcct aactcccgtg 10140

tctttgctta aagcattagg tcaagagtga ccgcttgggt ccctcctgct gaggagggtt 10200 

ctctccctgt ctctccccca cctctcacct gttggccacc tcaaccctct tcctgttttc 10260 

tctcttttgg aggggcccaa ggaagcaggg gatagcctct gttaatatat caagtctcag 10320 

ccctgtgact ccagtgttct ctaggttgga gtagcaagaa taaacactct tgcttgctgg 10380 

ggcttaggta cattttctct ggttctgtgc tagccaatat gccttgcctc cctctttatt 10440 

ggtgcatagg gacagggtca ctatgtaacc caggttggct ttgaactctc tatactcctg 10500 

cttcaacctc ccaaatgcta gggttgtagg tgcacaggac catacccact tccatatggc 10560 

tttctgaatg caatctttcc attttgactg ttttctaccc acctggccca gaagtgtaca 10620 

gcatttatcc aagaaaagag aaagcacact gcaaagaatg ttttctagaa attagaagtg 10680

acacataact agagttctgg tgggatttac aagagcagca tgtggagcga gacacagcct 10740

ggactaaagt ttggactaat gttttcataa gggtgatggt tatgatgctg tcgaatattg 10800 

acagtgctca ggcagggtca tctgagatac ttctgttggg ttcagttctg attttggcct 10860 

taatcgtgtg tggttgtttc tagaatatct tttaaaaggc caggcaagcc gggcgtggtg 10920

gtgcacacct ttaatcccag catttgggag gcagaggcag gtgaatttct gagttcaagg 10980 

ccagcctggt ctacagagta agttccagga cagccagggc tacacagaga aaccctgtct 11040

caaaaaacaa aaaaataaaa ataaaaataa aaggccaggt gggctttttt tcctacactt 11100 

ggttaaattt tcctttttgg gtcagccata tctttatctc tctgtccttt ctttctgtct 11160 

gggagcccat gtgtttcacc ggctccttag ggctattcca taagtagctg ccctcggcta 11220 

aggaaagcca ttctcacgct ccctcctttg cttaggagga agggtcagag ctcctgagag 11280 

ataaggccag gtcagatcag ggcagagatg gctctacaat tcccctgcct ggctctgtac 11340

gggttgagcc taggagccat aggggcgtat aagtgggcgt ggtgaggatt ctctgtgttc 11400 

cagtgatatc tctgagaggc gagatgctat cgggccagat ggtggtgata tcgacctggc 11460 

ctgtgaaatg ggtggggttg gggtggaaca ttaagttgtg actagagcca gaagagcctc 11520

tcaagagaag ctttcccatg gggaatgaag caacgaagtc atgaactctc gaaaaagcat 11580 

tgccccacat ttactttaag tatatatgaa ttcaaaacag taaaagatca taaagctggc 11640

tacatagaaa aatataaatg tatacagcaa cggaagggaa ccaacaaaca ttgttgaaat 11700 

gaatatcaaa atccacagta gcatagatca gagtcttgtc tgataataag cccaggatga 11760 

tgtttaaaaa aaaaacaaaa aacaaaaagc aaagagatgg aaagggattc aaagaaaaaa 11820 

aaaacagaaa cagccaagaa caaggacaaa ccaaaccttt ctcgtgggga gaggcaaatt 11880 

aaggcaagct gccatagttc tccccctctc agatctgaaa agagatctga taagccagtc 11940 

agcaggctgt ggggaagtag acactctcat atctgttagg aggaaggaaa aaaaacccga 12000 

cagactcact ggagcgcagc tgtcagtctt tatcaaaatg taaaatgcca gcaagacatg 12060

atggctcata cctgacatcc tagcacttgg gagactgagg caggaggatt atcttgtgtc 12120

agaagccgac ctaggctaca taaaagtgtg cgtgtccacg caagcatgtt tgtaaagtgc 12180

ctatgccttt ttgtctggct gactctgtct agttcattgt ttatccacat aactagcaac 12240

ccctgacacg tgtggtgtcc agtgtgagcc aggcgctgtt ccaaaggctt tcgatacatg 12300

gatgctttta gtcttcccag caacgaatag atatggcatt gctgtcatct ccattttaca 12360

atgaagaaac aaaaacaaag acttcagcca tgtatgcatg ggttaaccat gtatgacaca 12420

cggatgtagc catgtataca cacacagatg ttgccatgta gccacatata catggatgtt 12480

tgttcaccac aacattctta cacaggcaaa acattcaaaa taactaccac caccaggctc 12540

tggtcaaggg catgacaata catgtaatgt ggtagaatgc tgagaacatg agcaaaccat 12600

ctgtcatgtg tgtcttggtt acttccctat tgctatgaag agacaccatg accaaggcaa 12660

tgtataaaag gatatcttgg aattcatgat catccttagg cttagaatcc atgattatca 12720

gggcagggaa cacggaagca ggcaggtagc tgagaactta catcctgatc cttttacaac 12780

ctcaatgtct acccccagtc acacacctcc ccaacaaggc caacctccta atctttccca 12840

gttttaccaa ctgggaacca agtgtttgaa cataggagcc tatgagaaac attcgcattc 12900

aaaccaccac aatgtggaag actttgatga tattttatga agaaagcaaa ctttagaata 12960

acataaagag taagatttta tatatttaaa aaatttaaag atatatacac acacatatat 13020

acatatatac acatacactc acaatctttt atgcatgtgt tttaaaatat tttataaata 13080

aggataagtc taagaaagtc atggatgata ggagatataa tttttttcat tctatatctt 13140

ctgtgtgtgg tgcctttttt caatatataa tctttttttg tttgttttgt gagacagtct 13200

cacatagccc aggctggcct ccaaccctct atgtaactga ggctaacttt gaacttttga 13260

ttctcttgcc tctacctcct gagtactggg attattggtg tatgtcgcca ttcctggctt 13320

atgatgggaa ctgagcttag ggcttcgtga atgttaggca aacactacca accaagcaaa 13380

tgcacccagc cctcgatata aaacttaaac tgagccaggc gtggcgcacg cctttaatcc 13440

cagcactcag gaggcagagg caggtggatt tctgagttcg aggccagcct ggtctacaaa 13500

gtgagttcca ggacagccag ggttatagag aaaccctgtc tcaaaaaaaa caaaaaacaa 13560

aacaaaacaa aacaaaactt aaactgagtt taatagtaac tgtgaaattg taagttaata 13620

gtaaatgtga aattgtagtg ttgtgtggca tgtggaaggg attagtgaag tgacttttta 13680

tgaggttgat ttttatatca tgggtgtatt tcctgcatat atctatgtgc actctgtgca 13740

tgcagtgtct gtggagacca gaagaggatg ctggattcct tggaactgga attaaagttg 13800

ggtgcaggcc tccatgtggg tgctaggaac ggaacccagg tcctcctaga agatcatcaa 13860

gtgctcttaa ctactgacat ttctctccag cctccctaac tctcatttct agattagaag 13920

cattgttgat agagttttac agcccccact gcctttgggc aataggagta cctagctttc 13980

actgttcatg agaaccccaa atagaggaag acgggtaaga ggccattcta gcaggattaa 14040

ggaagacctg gtgatggtga tgcttcccaa aactgctcac agctcagctg tggggccact 14100

gaactcctga ggccaccaac agacccccat ggccaccatg ctaccagata ccagatggct 14160

ggtgactggt cacctacggc cccttcccag agaatcctta attttaataa gttcttcctt 14220

tcagagattc taggaaagtt aaaaaaaaaa aaaaaggtgg gggagggggg ctgggattct 14280

taaaccaaaa atgtagaagg agctcttggc tggttggcac ttcctctata gaaacaacca 14340

gagctctcca acccagtgtc tgaaaggggt aggactcatt gcaggaagaa ggaagccatt 14400

aaaacaagga gtccaggagg agccaagtag ggggaatgag gtctcagcag aaggtgggta 14460

ctggggagcc tcaccctggc cgatgtgctc cttccgcagt tgtagctgag agctggtcat 14520

catgatacgg atcttgctga gctcctggga gttttggctc ttcacacctg tctcttgctc 14580

cattctcttt gatgatgtgg tagctgaggg gttgaggaag cccctctaga tgatggagca 14640

tttagatcag accacatagc tatctagcag tgacgtaaat ggcatatcca tggccactct 14700

tggtgtgctg aggggcagga acgctgtgct gtatgcagga ctgtgaggca tcaggttcag 14760

ctctgcctca ccatcatccc tcagaccacc gtcatccccg acatttgcag gctttgtacg 14820

cagactcccc cccaagctgt gtcagacagc cctgttctat ctggctctgt gttttcctgc 14880

atggtttgta atgccgtctt tgatttattt caggttccgg aattgtgttt acacttacag 14940

gattctgccc aatgaggacg ataaattcac tgttcaggta agtcccaaac caacttgtgc 15000

tttccctggg agaaaagaaa aaagagaaag gaaaaacagg tattctgaat agttgcaatg 15060

tgagaaccta gtagagctta gctttgtccc ccaaagctat gtgtctgggg aactgagtga 15120

ataacagtga ctcctcatag atgacagtta taagatgata gtcatgcatg ctaaggatta 15180

ttggtggcca tggctgagaa aaggacacag tggtgaagga aagccaagcc agtgcctatg 15240

ccttgattta ttcttttgtt tgacttccat ggatgagcag atgtcagtgg cccaggaagg 15300

agcagaggga ggtgatcttg caagagagat gggagattag atgagatggg gcctttaaat 15360

tcccatagct cccttctgcc caaagcattg gtcatggcat ctgaatcact gtaaaaaggc 15420

caaatataca aagccataaa taaacttgtt gattgattga ttgattgatt gattgattga 15480

tgagttatgc cacagtatgt gtgtgtagag gtcaaaggac aacttgaagg agttggttct 15540 

ctctttctgc catgtgggtt ctggggattg aactcaaatc ctaaggcttg ggtggttact 15600 

gcctttaccc actgaggcat ctcaccagcc acagctggat tttaaaaggg aaacaaacaa 15660 

acaaacaaac aaacagacaa acaaacaaaa aatcatcagt gccaatcaaa gccggggaag 15720 

accacccttg cagtgtttgt gtgacaaaca cagggacttg tctgaccaca gagtgtggca 15780 

gctgagtgat gtggaagcca gagaagctag tttttgcctg acttcattgg ggaagcatgg 15840 

gggcgtgcag aaaggaggtg acagccccca tgtcctcagt agaggagagg gctgtgtgtc 15900 

attaagctca cagcgaccat agtaagagtc acaacagcag atgggacatc tgtgtgagca 15960 

gtgcagtaag gaaagacaag catctaggga ctgggtcacc agaacaacaa ggagcaaagt 16020

cttcaggatt ttgctgaaaa gaaaggggtg ggaagggaga tgtgaggcat gttggtggct 16080 

tccggaatgt gacaggaatg atgaattgtt tcgatcaaag gggtagactc agagctgcct 16140 

tgttgaaagg cagtccttag gcaaaaggcc ccagaaacat ggaagcgaga gagaagttat 16200 

gaacagaggc aaagcctcgg agggcagctc atgatggaat tcaaaagaaa ccacatcagt 16260

acttccaaca gtcacttcct ccagacggtg ggactcgaag ggggccagtt ccatccaggg 16320 

tgtggcaaag cttccgtcta ggggaagggt ctgctcatac tgagcttcat gggagaggcc 16380

agcgtgaacc actgtgggct ggagtaacgg ctgtactgca tagcatggtg gtcaggctga 16440 

gcacagttca tactttttct cgtgtgcgca tgaacaccag gatggaaggc ggggcggatg 16500

ggaggcaggg cggatggaag gtgtggtcta ggggtcaggg agtccctttc ccagcccagt 16560 

cattggtggg aagcaacaga acaaatctga gtacccgcaa gcaagcgccc ttttttaagc 16620 

ttggcaggtg ttcgttgcct gaatactttc tgagatgtgg acgccaccag cttctggaga 16680 

catggctcag gggaggtggc ttaggcatgc agccgtggtc tctacttttt tttttttttt 16740 

ttttttttct aagcctagga ctgtgtgaac tttgcctggt ggcttggtgt ccttggcaag 16800 

tggcttcctg ctaggagtca actctcttaa gttaaaagtt agtcggcatc tttgatcttt 16860 

ctcccctttt gtggtttctt ctgttccttc aaagtgatga agactttaag tatgactgga 16920 

aaagaggtga ctctgcggct tcttgttgag ggtttgtgcc cacagcttgt gaacccataa 16980 

ctctgcctgt ctggagtctg tacccacgtt ctcttcactc ttgcttccgt ggcttcctgt 17040 

gaccccagaa atgaccacag accttcagtg ggactaggag cttcagtgtg ctttgggagt 17100 

ccctctggct gatttttttt tttttttttt tgcaaaggtt tagtgccttt ggtggcctga 17160 

aggacactag agaagcccag gccttgggac cagaggctct ggggccctgg ctgggttcta 17220 

cctcagtgag tcagctcttc tctgtgggtt gctcctcccc tgctgacagc actgctcttc 17280

tcagatccca gagggaacct gattggctca gccaccaaca gctctctctg cctgaaccac 17340 

tcaccagcta ctgagctctt tgctgcttgg ctgttaggaa gcaggtacct gcccaccact 17400 

gacagatgag gacagatcat agcctcctgt ggctacctca gccagggcag tgagcaggca 17460 

gtttttatag aagagatatg gccttgctgg acacaattgg tttcagtctt gcacttgacc 17520 

tatgaccttc ttggtcatct ataccagcca agggacagga agccatgggc ctaaatggtt 17580 

tgaaagtccc cacctataag gatgtcttgt cctataaaaa acaaacaagc aaacaaaaca 17640 

acaacaacaa aatccccaca catttacaag tgctggagat gtagataagc tcaattggta 17700 

gagagcctat ctagcatgtg cagacatggt tcagttccct agtactgtgt aacacctggc 17760 

atagtagcac atccctgtaa tcttagcact tggggataga ggtatagagt taagaggatg 17820 

gaaagtcttc aggtacatag tgagttagag atcggcacac atgcatgtgc acacgctcat 17880 

gcacacacac atgcatgcac atgtacagaa cacacaccag acatgctaaa acaaacaata 17940

taaggaacaa atcaaagctc aeagccagtg aaggcaaggc agaaacattc ttcaggaagc 18000 

gttggtattt ggccactgct tattttaact ccagttcctc ttttcatgtt gaagcaaaac 18060 

ctgcattctt tcctgagctt agccacagtt tctgtataac tcagtcctta tacctaaacc 18120 

tgcttagtgt cagcagcagc aggagcaagg ctgagactgc acagcgacac accaagggcc 18180 

ttcatggtcc cagctcttct tatgggaggt ccttgttgaa acattcattc caccttcaat 18240 

ctttcccaag gacaatctgt atctcgtggt ccagaaggct cactgatgga gggaagcacg 18300 

acccccacca ctcagggact cagctctgac ttcattactt atagacctac aaaatcagag 18360

gtgggaaggg tggcagcagg attctcttgt ttaacgcggt agacactagc gttcacaacc 18420 

ctgcttccaa gctaagtctt ctttctcagg gctcagaaca tgtttccagg cataaatggg 18480 

tactgaaaca cttcggcagt atgggaccca agaccatttc acggagggga ggcagggtag 18540 

tggtctctct gagatgggcc cttccttgcc tatttggatg gtatttgatc atgaatagca 18600 

cttggctgtg tgtcctgtgc attcaggatt ttgcagccac atggctttgt tcagggactg 18660 

tgaagagacc aaacaggaag atagaatgtt acatctcccc tgctgtggtc acaattctgt 18720 

tcacgaatga cagggatggt ctccatggag gccattgttc tctttactgc ccttgttctt 18780 

agtaaggaga catccatgtc cagttcttgg taagaaggga gacagggggc tggtgagatg 18840 
gctcaacggt
cacatggtgg 
gacagctaca 
aagaagggag 
tgggagggaa 
aaggaataat 
tgtggacgtg 
agctgcaaag 
tggaggatgt 
gagacaggtt 
acatcctctg 
cgggccccgg 
tctacaaagg 
agagtgagct 
ggagggtctg 
gaagagctct 
taggggcctc 
ctcatgattc 
ttggggataa 
agaggctttg 
ccccactagc 
tccctggtga 
cacgttttca 
atgaagttca 
agattccgtc 
acaaggaaga 
ccctttgtgt 
tttattttat 
cttgcagctg 
ccccagaggt 
cccaaggcca 
tcctttcctt 
tcaaaggttt 
ttctcagatt 
cttactccca 
gcgcctggag 
gccatctctg 
ttgttgctaa 
ctgcatagca 
ttgcagaggg 
acaagacaaa 
gtttcacaac 
agataaaatg 
tctaggagct 
acatatagga 
ctagggggga 
gcacaggtcc 
agtgcatgct 
actcaatatc 
agcgacctct 
ttctcatttc 
agagtctttc 
acatagccga 
ggatcagagt 
agaaatagtt 
tggctcagtg 
agcagctgag 



taagagtgcc 
ctcacaacca 
gtgcacttac 
acagccacat 
agagatgctg 
gagatgaggg 
atgggtacgg 
gtgatggagg 
ctatccctgg 
taacgcagat 
ggtactgtgt 
gttcccatgc 
aagtgactct 
gttcacccag 
aggcaggaca 
gagcagcctt 
tgggcttctc 
tgtcgcataa 
ggactgtgaa 
agatggtctt 
cagagcctgt 
tgtctgtcac 
gagctggccg 
ggaagcgggt 
tgagcctctg 
atgcagaaca 
cctggcaacc 
ctgctgtaga 
ggttcatatt 
gtcgttccat 
gctcgtgcca 
ctgtggatcc 
ttcttgtttg 
ctactgagaa 
gaaacaaaca 
ggagagctaa 
cctctaaatg 
ttctccatat 
cagagctgac 
gaaggactcc 
atggctccct 
cgtgagaaaa 
aagccaggtc 
ggcctgccca 
ttctgaaaca 
atctcaaaaa 
acaggctgtg 
gccgggattg 
ctattgctct 
gtgccctgga 
agctttgccc 
ctcggaggaa 
gggcgtgggt 
ctagaaaaaa 
ggatcccggg 
agaccctgtc 
agcactggct 



gactgctctt ccaaaggccc tgagttcaaa tcccagcaac 
tctgtaacga aatctgatgc cctcttctgg agtgtctgaa 
atataataaa taaataaata tttaaaaaaa aaaaaaaaaa 



gttcttcagt 
agaacagtat 
aaagataatg 
tttctacaaa 
atgaagatgg 
ctctgctccg 
tctcagaccc 
gaggccagag 
ctagaggcgg 
cccctcaggg 
ttgttgcctt 
agaagtctta 
tagacatgat 
ttgctgcctg 
atatctctgt 
aactggggta 
tcccacagac 
cgtgatctgt 
attgctctgg 
gtcacaagcg 
ggcagctgtt 
ggcaaaggat 
ggctcagaag 
atattctggg 
tataggtcta 
taaaagggaa 
agcctttgaa 
gttgtagtag 
tgctgagatt 
actttttttt 
agcccagtcc 
agcaaacaaa 
ttgattctca 
gcttatcctt 
cccagaatgt 
actaactgtt 
gtaggggcgt 
ccctcccaac 
ggtgactgga 
ttgaatggtg 
tgggcttcct 
cgtagctgaa 
aggaaaaccg 
gtctgcaacc 
gttctgtacc 
cactcaaatc 
aagatgtctc 
tttgagagca 
ataccatttc 
aaggtggctc 
gaagggctgc 
agctccctgg 
tcaaaaagta 
gctcttcctg 



ctgcaaggtg 
aaaggggggg 
aaggaaggaa 
ggaaaagagg 
aagtgtgctg 
ggcactgaag 
agagagtgtt 
gcaggcaggc 
tgttcgtgtg 
aagtgacatc 
ttggagtagc 
acagacaact 
acctctagcc 
acacttccta 
gaacttcctg 
ctgtgggttt 
gccaaggtta 
gtagatcaga 
gcagttcggc 
tcggggaggg 
attgttctaa 
gtggctcaaa 
ctagcatggc 
gccctgtgtt 
ttttctaagc 
attcttgggg 
aatgtagtat 
agagacaggt 
gatcaatttg 
tttttttaag 
tgccctcatt 
caagcacccc 
ggtgtctggg 
ttctccccat 
tatggtctgc 
gataggctaa 
cttatcaaaa 
tagaatatga 
gaatgggacc 
ctattggatt 
gttagtgaga 
agtggtccaa 
tacacttcag 
tcgatcacac 
atccagacat 
agaaaaagta 
catttctgtt 
ggtgtcattc 
ccccagtgga 
ccaaggaaag 
aatcccagtg 
ccggctaggc 
agacggaagg 
atgaccagaa 



aaaaaagaat 
gactgaaaga 
cagtcaagaa 
tgggtgcatg 
tgggctgtgg 
caaggatcca 
ctgatttaaa 
aggatgctta 
aatcctgggt 
catcttgggc 
tttgctccct 
gtgttgtgtc 
aggtccacca 
tgaagaagaa 
tctctgctga 
tcatgcacgt 
ctcttccaga 
cgcgtgactc 
atctggagaa 
cgaacggttg 
gtggtcacca 
cacacatttc 
caatgtgaag 
ttaactttgc 
ctctgactat 
ccaagtttcc 
gaataagtcc 
attacctttt 
caacacaggt 
ccagtttatc 
tgccagcctt 
tcacctgccc 
ggtaggggtg 
ctctgtgcac 
tttgtttgtg 
gcccagggaa 
agtctttctt 
tcattggaat 
atgcagacag 
agctgcatcg 
tgactctttt 
catacaaggt 
attgagcacc 
ctctctcctt 
ccctaacgat 
caaaaatcaa 
tctccatggg 
agagttgtat 
accttgctgg 
tgctggccac 
cgggtggggg 
tagcccaatt 
gctggagaga 
cccacgtggt 



gggaagaaaa 
cataaaagaa 
agaggaaggc 
cagggcgaga 
aaggtgtctg 
cagacactgt 
gtccatgcgg 
cttagcccat 
ccttcctgaa 
cgctgacttc 
gggaagccag 
tgaggaggct 
ctgaagcgaa 
aaaagcccag 
gccggcagtg 
tcagaagaag 
cagaaggaaa 
aagttcctgg 
ggaacaaaac 
agccgttgtt 
agtgaaataa 
ctgaggttgt 
acagggagac 
tttattttat 
ctcaggacat 
cttcagcatc 
caagcaccaa 
gctcgctgca 
ctttcaagac 
tgttttattt 
ctggaacact 
cttctcctgt 
tttacgctct 
ccaagccgtc 
ttttcttctc 
cagtactcaa 
agaaacagac 
tgctgctgct 
cagagcagaa 
aattaacaag 
gtgttcaagt 
ttggtctatg 
gagagtcact 
agtcctggcc 
ttgtctctgg 
tcagcatgct 
atgatttagt 
tgagcagatg 
atcactcata 
acaaacatga 

gggggggggc 

ggtaagctct 
ttggtgatcc 
ggctcacagc 



18900 
18960 
19020 
19080 
19140 
19200 
19260 
19320 
19380 
19440 
19500 
19560 
19620 
19680 
19740 
19800 
19860 
19920 
19980 
20040 
20100 
20160 
20220 
20280 
20340 
20400 
20460 
20520 
20580 
20640 
20700 
20760 
20820 
20880 
20940 
21000 
21060 
21120 
21180 
21240 
21300 
21360 
21420 
21480 
21540 
21600 
21660 
21720 
21780 
21840 
21900 
21960 
22020 
22080 
22140 
22200 
22260 
catctgtatt
aagctcacaa 
ataaaagaaa 
catacgcatg 
tgcacacaca 
catggtcctt 
gtagctatca 
actggacagc 
tccatgcctt 
acttctctcc 
tccaactatg 
gcagctcttt 
ctcaaggcta 
tcttagccag 
catctgtgct 
cttacagagg 
tgtaaccagt 
ccaacacagg 
ggtgagactt 
aggtggaact 
cacgaacaaa 
ctggggcgga 
tggctggtac 
agggtcccat 
ggaggcaggg 
atggaaacaa 
caatgtcgct 
acccctctcc 
gttgccagta 
ttggatgcca 
gagctctgca 
aatggtggtc 
tggtatcctt 
acagcttatt 
tcattattct 
atctccatcc 
tgaagtgcca 
gctctgcatt 
aggtactcaa 
acatgaaaag 
aggtgccttt 
caaagtgagt 
aaaaaaaaaa 
gcctgggatg 
gggctgcccc 
gtctgcctgc 
gaagctagaa 
ggctctggca 
aagtggtccc 
gtctgcctcc 
accctaagtg 
aagacagatt 
ttccccggtt 
aggagaaggc 
taagcccagc 
ggctcagtga 
gaaatgattg 



tctaagttca 
agcacataga 
atgtaagtgg 
cacacatgtg 
agaacagact 
atttattttt 
atggattcgg 
aggatttgct 
ctggcacacg 
ccagtgactc 
ccaccaagct 
catctttcct 
tggggtcctg 
gtcttcggat 
tctgaacctg 
tttaaggact 
gcatgatggg 
gtgatccagg 
aagaaccacc 
tgaaccaaga 
gtgggagacc 
gtccttctac 
ctggaccact 
gacttcacct 
acttgggata 
gaccagtgag 
ttcagagcct 
actgtaccct 
gcaacataac 
gaaatccaaa 
ccatttctga 
aatatgtctc 
ttgcatctca 
cacaggttct 
gcctcccaca 
tgagaccaga 
gaaagccacc 
aggtactcag 
taatacttga 
gaaggagtat 
aggacctctc 
tccaggacag 
aaaaaaaaaa 
atgccagggt 
ttgaggcagc 
tgtcactcct 
gggagggcca 
acatggaaat 
aaatgctgtg 
ttccttttct 
agtagtactc 
ttgcttgttc 
gtgtgtacga 
tgatatttcc 
tttcactatc 
ttctcttaag 
cacaggaaga 



aaggctccag 
cacatgcagg 
acgataactg 
tatgcgcaca 
cctcaccctc 
tgaggttctg 
tctgcttgca 
ctgccctgct 
cttttctccc 
actcttgctg 
aacctcagtg 
tcctgtgggt 
tccaggtggt 
cctagcaccc 
tctcctcctc 
gaaagaggtg 
gcaaataaac 
caaggctggg 
aaagtctctg 
ttgagtatcc 
ctagattctg 
tttggaagaa 
ggccatgtct 
agggtcttgt 
cactatttca 
atggctctcc 
cttttccagg 
gttctgtttt 
ccagtagtat 
aatggtcttc 
agacttcaga 
acagtgccaa 
gagcagaaca 
gagatgggga 
gcagctatgt 
agatggatcc 
tacactcatt 
tgcagataat 
tacatccaga 
tttttgtgtt 
tgcacaggcg 
ccaagactat 
aaaataggac 
cgtggagcat 
tcctctctgg 
gtctcccaga 
ctcggggatg 
ctagattgct 
tctaaattct 
ttttcttggg 
taccactgac 
gaagtgggga 
ggagaaggct 
tgtcaccaac 
tccagtcctg 
ataaaggaag 
gcaaggggtg 



taccctcttc 
caaaacgccc 
aagtcaccct 
cacacacaca 
cacctgccag 
aaacccttct 
tcttgagaag 
gcagttgtct 
ggtggtcaag 
gaactacctc 
cctagtcact 
tctctctcct 
tactcccgaa 
aggatgctgt 
aggagtgctg 
gccaaataag 
cagaagtcca 
gacaccagag 
gagtttggag 

agggggtcct

tatctaggaa 
ccaggaaggc 
aagagagacc 
tgccattgag 
tttaatcctg 
agacccacct 
cctccccatg 
cctaacatcc 
acagtcatac 
actggctgtg 
aggaaatctt 
cagaatggcc 
tgagcaagcc 
catgaacatc 
tcatgaccta 
aagagaactg 
aggtactcaa 
gtaaatacta 
taccagttag 
tccaaatagc 
gatttctgag 
acagagaaac 
ctctctgcag 
ctggagcaag 
gctggctccc 
gtaacttgag 
taattaagga 
catgttccca 
tagcagggac 
atgccaagaa 
ctacacccca 
gggggaaatg 
tccttaaagt 
atcttctgtc 
aaggacacct 
agtgggtgga 
gggagattaa 



ttgttgcctt 
atccgcactt 
gcatttgcct 
cacacacaca 
gtgattattc 
cagcccacag 
acagaagagc 
ccacagtgct 
ttcttccgtg 
ctgcatttca 
ctctactgac 
gggtcctcac 
gccatgtttt 
gtgcgttctc 
acgcactgtg 
cacagagtat 
gaggctggtt 
gcagaaaaga 
gaaggcagct 
gactggcttc 
aaattgctgg 
agaggcaggg 
tgaggacata 
tctgtttagt 
ggaagaggaa 
gtggctgacc 
ggaaaccatg 
cttagcacta 
acattgatta 
ctaaaatcca 
cctctgtcct 
ctgtgattat 
tagctcactc 
tatagaaccg 
tccattcagc 
agaccttgcc 
taagtagtga 
gatatatcca 
gttctcaata 
cctgtccatg 
ttcgaggcca 
cctgtctcga 
cagccagtaa 
gctaatcaga 
agggccaagc 
ttttacggaa 
taaaaatatg 
ggactgtgtt 
catttaaaag 
ttcaacccaa 
atgccccaag 
catatggtaa 
tagctggcaa 
cagaaacaca 
cgccctcact 
ataatcttga 
tgtggcctgc 



gctagggagc 
aaaaattcaa 
ctggcctcta 
cacacacaca 
aattttatgc 
tgagaatcca 
tgcgggtccc 
gttactggaa 
ccctgagaag 
gtactgagat 
cctgttacca 
gacctcatgg 
cagagaaggc 
ttaaagatgc 
gcatggcttt 
agcataccac 
agctgccacg 
gcgagacagt 
ggaatgccta 
caatcttggc 
cacctgctca 
gatagggtca 
ggggtcagag 
gagcctccat 
tggtactcta 
ccgccacccc 
tttgcccaac 
tctacaacta 
tcttgtattt 
gatatgagca 
tcaagccagc 
cctgattaat 
atgctggatg 
gatggggaaa 
tatcctacct 
tgtcttgtac 
gcaaatgcag 
ggcacccatt 
aatactggat 
ctatttgcca 
gcctggtcta 
aaaaccaaaa 
gactcgaatg 
cagttgccct 
acatccactt 
agtaaagaga 
ccatgtcctg 
gtctgtatcc 
ttatgtgtgt 
ggccttacat 
agtacatcta 
gaagatgaac 
acatgcaagt 
gcctaaccaa 
gcctgcccag 
cctcaagatg 
gaaaaaaaaa 



22320 
22380 
22440 
22500 
22560 
22620 
22680 
22740 
22800 
22860 
22920 
22980 
23040 
23100 
23160 
23220 
23280 
23340 
23400 
23460 
23520 
23580 
23640 
23700 
23760 
23820 
23880 
23940 
24000 
24060 
24120 
24180 
24240 
24300 
24360 
24420 
24480 
24540 
24600 
24660 
24720
24780 
24840 
24900 
24960 
25020 
25080 
25140 
25200 
25260 
25320 
25380 
25440 
25500 
25560 
25620 
25680 
acggagaaag
ctatagtctt 
ggtgctgagt 
ggaatcaaaa 
gggaaagtat 
tggtacctta 
tcatggagga 
cagaaagcac 
gaagagggac 
ttcactctga 
cttaccacct 
acaaagctat 
cacacacaca 
ttattgtaca 
atatctctta 
tggataaaat 
atcccttttc 
ttatttttcg 
tctttctgct 
aatattttta 
tacttgttgc 
atcatccatt 
ggacatccaa 
ttgccaggta 
aactctgagt 
caaagaaacc 
tttatttgat 
gcgtgtgggt 
nnnnnnnnnn 
nnnnnnnnnn 
ccacagccac 
tgccctccct 
tcatctctct 
cttgtgaaat 
cctggctgtc 
gcctctgcct 
tttaaacaga 
ggagccggcc 
aatcccagta 
aatagggaat 
tcattttagc 
attattgggg 
agtgctttat 
agcagcctcc 
atgctaacgt 
ttgattagtc 
ggcctgtttc 
ggaagaggaa 
cagtccactg 
acaccaggaa 
gacatctgac 
ctagccacca 
cttaggagcc 
cagcttgacc 
ttatcttcct 
taaacttccc 
tcacccagaa 



caagctgagg 
gctatagatg 
caacatgtgc 
ataccaaaca 
cagtgtgaat 
aaaatggcgt 
gagaactgca 
ataatggtgc 
aggaggagcc 
caagaggtct 
gagttttgat 
tctctaacct 
cacacgcttt 
tttggtcaat 
gctcccacca 
ttgcatcctc 
acatatgttt 
tttgttcatt 
tctgcctcca 
tttaaaaaaa 
tctttcagag 
tcaggggacc 
acacataaaa 
atggtggtac 
tcaaagccag 
ctacctctga 
tagtttactt 
gtgttgttat 
nnnnnnnnnn 
nnnnnnnnnn 
cacagccacc 
gctctctatg 
agcctgtctt 
tttttgcttt 
ctggacctca 
tccgagtgct 
ggctgggata 
ctgtgatcag 
cttgggaggc 
ttgcagccag 
ttatcagttt 
ggaaaatagt 
tttgacattt 
gatagtaaac 
cccacttcct 
tctacttgca 
ggatttctgc 
gtgggtaaat 
gggacttctt 
ctctcagggg 
tcctaggacc 
ttctgcatgt 
cagtgctgca 
caccctacct 
tgtgtgcact 
attgtctggt 
gggtggagat 



gatccatgtg 
ctatatagtc 
tgtctctgct 
caagtaaaat 
acacttctct 
tgaagcatgg 
aaaaacttgg 
ttggttggta 
ctggagctta 
ggctgcttga 
ctctaggatc 
acatacatcc 
ctctagtaag 
tcacctcgcc 
agtcatcttt 
agtattttgg 
tcatcatttg 
tgtttgagat 
gtgctggaat 
aaatcttttg 
gatgtatgtt 
cactaggcac 
taaacaaatc 
atgcctttca 
cctggtctac 
cctttgtcct 
tttcttttaa 
tgttgttata 
nnnnnnnnnn 
nnnnnnnnnn 
accctgctga 
tagttctggg 
tttttccaga 
gttttgtttt 
ctttgtagac 
gggattaaag 
tagctcagtt 
tcctcagcat 
agggctggga 
cctgtgcaac 
ttctcctgtg 
attttttatt 
gctttaaaat 
agtcccccct 
ctcagatgct 
ataatgcagc 
tccagaagga 
aagatggggg 
cttaggtaaa 
ataacctgtg 
attcctaccc 
ccaccctaac 
gagttcactt 
ctgggaaggg 
ttttcctacc 
ctagcatatt 
gcctcctagc 



ggcttttgtg 
ctgctatata 
tttttttttc 
cccctcacct 
caaatgcagc 
tgtgggctct 
caacaggggc 
tgaatatgat 
cccagcctaa 
ctcggggata 
cacataaagg 
tacacacaca 
aaagagacct 
taaaagctca 
gtctcagtat 
tttgcatttc 
cattcctttt 
agagtctcat 
aacaggcgtg 
gggcgggaga 
tgattcccag 
atatgtagtg 
taaaaaacct 
tcccagcact 
agagtgagtt 
cccaatatac 
agggttttcc 
tatgtgtgtg 
nnnnnnnnnn 
nnnnnnnnnn 
caaggctcgg 
gatctgaact 
ttatattatt 
gtttttcgag 
caggctgacc 
gcgtgcacca 
ggtacttgcc 
cataaaagca 
cacacagaag 
acgggcttct 
gattctagat 
ttctaacact 
ggggactaca 
ttcctgccga 
ggatgtagac 
gatgcacatt 
gactataaat 
tcagcatagg 
gctgaatgct 
catcccacaa 
cagctggttc 
ctatgatgta 
cctctcccat 
aaggcccttt 
tcgatgcaga 
cagatcaccc 
tgactggcaa 



acacccggga 
gatgtggcta 
tttaaggcaa 
ggaggccagt 
tgtagaatct 
cagaggagtt 
tggggcagtg 
taaattccca 
caaaatattc 
aaggcacctt 
tggaaagaga 
cacacacaca 
acatgaagga 
tcaagtgagc 
ctgtaagttt 
ctttattaaa 
cttataattt 
ttgtctcagg 
tttctcccat 
aatgtctcag 
cacccctgtg 
cacatacatt 
ttctaaataa 
caggaaggaa 
ccaggacagg 
atatatttag 
gtgtgtgtgc 
gtatgtgtan 
nnnnnnnnnn 
nnnnaccacc 
gtcacatgca 
cagaaaattt 
atttcacaag 
acagggtttc 
tcgaactcag 
ccacgcccgg 
taatgcttgc 
gcatggtagc 
atcagggtca 
gttataaaac 
cttgtgactc 
tttatcatgt 
actcttcttc 
ctcacggttc 
tcctcctcct 
ggtagcagtg 
caagaccagg 
tcctgtatga 
aggtgagtgg 
tatgctcctt 
ttcctctcac 
ctctcccttt 
tcagcactct 
tatttacact 
gccccagctt 
agggtctgct 
ccctgtaagc 



ccactgcaca 
ccatctgaga 
gtagaaagca 
ttctcttcct 
gtaatgtctt 
ctttgaaggt 
gtgtggtccc 
gcactgggtg 
tttaaaaaaa 
ccacactatc 
gaaccactcc 
cacacacaca 
atattcattt 
ctcccccacg 
cctcaagcaa 
aataacaata 
gcttatttat 
ttggctctga 
gctcagattt 
cagctagaag 
gcagcttata 
tgtgcaaata 
aaatatgttc 
aggcaggtag 
caaggctaca 
taaactaatt 
gggagcgtgt 
nnnnnnnnnn 
nnnnnnnnnn 
accaccacca 
cacaccacta 
ataaactaaa 
tctatcctga 
tctgtatagc 
aaatctgcct 
ccctgaaatt 
tagaacacag 
acacacctgg 
ttttcagcta 
aacaacaaaa 
accaacctta 
ttctatgtca 
ccggtgactc 
ctcagcccat 
gttccactta 
acattcttat 
gcagcaaagg 
aggttccagc 
tcagaaccag 
aaaggagaaa 
aaggagagtt 
cccaaaggac 
cctaagagct 
gcgcttactc 
ggttttctca 
tgtgcttctg 
ccacgtctgg 



25740 
25800 
25860 
25920 
25980 
26040 
26100 
26160 
26220 
26280 
26340 
26400 
26460 
26520 
26580 
26640 
26700 
26760 
26820 
26880 
26940 
27000 
27060 
27120 
27180 
27240 
27300 
27360 
27420 
27480 
27540 
27600 
27660 
27720 
27780 
27840 
27900 
27960 
28020 
28080 
28140 
28200 
28260 
28320 
28380 
28440 
28500 
28560 
28620 
28680 
28740 
28800 
28860 
28920 
28980 
29040 
29100 
tgtgaacagc tgatggcatg gagagacgta tgtctgttct gacttgactg ggatctgact 29160

gagctggtct tttttctggg tggtcctcaa cagctgatga atgaatctca caagatgctt 29220 

tgcccaaatc tacatagatg agcttgatgg tggcaggcag gtggctgctg tgatgttgta 29280 

gccatcaagg cttagcactg ttggttatct gacacatggt tttctttaag tatggaccag 29340 

tgagaaacag ttcttaaaat taacagattt tttttttcca cataattatt tgaagttctt 29400 

agctcctttg cagggagagg gggggggggc tccaatgagc ttgggcttgt tccctccttg 29460 

cctaccatct caccccagga ctgttctgct tctgatacag aaaaggatct tcttatgggg 29520 

gaagggggag ggacttgtgc ggtcccatca ccccctgaag atctgtcggt ggttaacaat 29580 

accaggggag gcagtcagtt tctccagtgg tgtagctact ggccaggcat ccatgctcct 29640

gtcaataacc tctcatctat gctctcttat gaacaaccca aactaaactc atcggcatat 29700 

gtgtgtgtgc atgcatgcat gcgcttgctc tcatgtgcac acgcgtgcgc gaacccacac 29760 

tgcataaatg agagataaaa gaggggatta gaaacagaaa gagatcagca ggtatgtgag 29820 

ggggacaaca gagggtcata ggaggtaaat gtgaacaaaa tgtgtgtgtg tgtgtgtgtg 29880 

tgtgtgaaaa cctcacaata aagctgtttt ttctgtataa ttaatatata taaatctaaa 29940 

agatttaata aagagcctaa tccttaaaac catgtttcac tgactggatt ggtccccatt 30000 

gcacacccac tgtcccctag accgctacat gctttgcatt ttctcataga ccctgctttg 30060 

gcagattgcc ccgacctgaa accttaccca agacgcacat aaccccacca aactgtgaac 30120 

tgcagagtaa gccttctcac ccattgtgaa agtcacaact tccccgatct gcccttccct 30180 

ttaggcctcc cctcccccaa ggcccagttc tcttgctcag aggtctcttc cttccctggg 30240 

atggagcttg ggcgtcacag cagcaggtgt aactgcatct gtttgcctgc cttggatctc 30300 

aggctagact gtgtcttgct cagttccttg gagcagcaag aggcgtcaga ggtaactttc 30360

ggttcctcta aaggctggtg tttatcaaag tctacctaac tgaacttctg gcctccagag 30420

gtctggggcg tgccatgctg aggagctgat ggacccatgc tatcatgaca gcttgccaag 30480 

cagagaacac tttgttcttg tccctttccg gctcttctca gcctcagaca atcccaccac 30540 

ccaccacggc aagctggaaa cttcctgttt tggtctgggc ctttgccatc tttaagttgg 30600 

aaaatctttg tccttgatga atgctccttg ccctctgcat tgccccagaa gttattaacc 30660 

tttttgtgac cactgtaatt tcttcaggat tcaatgttgc cttctaatcc tggggtcatg 30720 

cagaaggaaa accaggtata gacaggaagc ccagtccgta ggagatgatg ctgttagaat 30780 

ggctgatgtc actccctatg acacatgagg tgacaaacat ctgagagcta ttctcctata 30840 

aggaaaacaa tgcccccaat cttctgccag tgtctgtggt gtcaatattc ccaataagac 30900 

gcccactaga atggctgctt gggattgaag tcagtggaac catgggtaat aaccctggac 30960 

agtgtgatat aaactcaaat tggaagagct gacaaagtcc tgactctaga ctttgagcat 31020

cttgatcaga gcctcatttt cctagttggg gtttctgttg tcatggagaa acaccattac 31080 

caaaagcaag ttgaggagga aaggacttat ttggcttaca cttccacatc acagttcctc 31140 

attgaaggaa gccaggacag gaacttcaac aggacaggag cctggagaca ggagctgatg 31200 

cagaggccac ggaggtgctg cttactggct tgtttctcat ggcttgctca gtctgctttt 31260 

cttatagaac ccaggaccac cagcccaggg atgaccccac ccacaatggg cttggctctt 31320 

cccccatcaa ccactaatcg agaaaatgcc ttataggatt gcttacagcc tgattttatg 31380 

gagtcatgtt gtagacaaca tttaatctca gtctgggttt ctaccctgtc tttgatcatt 31440 

caattcccag ataaaagaca cacaacatga ttatttataa taagctttta atgcactaga 31500 

gctgggtaga tatctaccct ctaaaccatc tgaatctact tccctaccca taaccccgag 31560 

ttgtcacttg ccatgttcca tctgggccac tcttaactcc aagtggccag catggccatg 31620

tttctataat tcacctaccc catggtaact tctccacatt ccatcttctc tctttcctct 31680 

cgtggttttc ctccaagcct gggaactccc aatccctgcc tgtctcaatt ctgcccagct 31740 

atagactgta ggcatcttta ttcaccaatc agggataact tgggggagga gacaaggtta 31800 

cagagctctt gggtctatgt gcagattctc ttgtccctgg gggcaaccag gccttgggga 31860 

ccagtattta gcattattat acctagcaaa agaccaaacc tccacagagg cattttctca 31920 

attgaagttc cctcctctcc aatgactcta atcttgtgcc aagttaactt aaaaatagcc 31980 

cacatacccc tgtacctcag aggggtcagg caagttcaca gtcacttcca cctgacttcc 32040 

ggtcttttca ggccctaaac ccttactcag agtcatttta attcagtatt ttcttttctt 32100 

tttatcacct ttggggaggg attttttttt ttcccaagac ctcttttaca gacttgactg 32160 

tgtgatcaaa cttcccattt ctctgagctt cggtttatgc ctctgaaaag tggaaggcaa 32220 

ctgcccttag cagatgactt accccgggct tccagcagca actaactctg ctctcggcca 32280 

agcagatcgt ataatcctct tccacagaca aggagagtag tgtagcctga gagcgttaag 32340 

tatcacagcg ttcacactcc aacagcaaaa tgccccaatt agaagcaact taagggaagt 32400 

aaggtttact tggggtcatg gttgagagga tatggtccat catggtgggc aagcaggata 32460 

gtgggaccac aaggcttgcc ttactgtatc cacagtcagg aagctgaagg agatggacac 32520 
ctgtgttcag ctgcctttct cctctttcat tttttttttt tttatagagt ctagacctcg 32580

ccccatgaga aatgctgcta cctgcattca gatggtaaat ctccctcagt taatcttctc 32640 

tggaaattcc ttgacaaacc tgttgagaag tttgtctagt aggggattct aagtcctgtt 32700 

aatgtggcaa tgaagattaa ctacaaaaca gtagagcctc gaattaactt ctcttgacat 32760 

ttgacatttg taaattctag caatgctagg tctatgttcc ccacaggcta taattagaac 32820 

taaatggaaa caacttcaac agtagcaact cctcctcctt ctccttctcc ttctccttct 32880 

ccttctcctt ctccttctcc ttctccttct ccttctcctt ctccttctcc ttctccttct 32940 

tcttcttctt cttcttcttc ttcttcttct tcttcttctt cttcttcttc ttcttcttct 33000 

tctcctcctc ctcctcctcc tcctcctcct cctcccgcnn nnnnnnnnnn nnnnnnnnnn 33060 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 33120

nnnnnnnnnn nnnnnnnnac tttaaatttg taactactta tttgacaggg tctcattctg 33180 

taaccagccc taaatttttt acaactttct acaaatggtg cttttggcct gggtataagc 33240 

actcaaacca tgaatgggaa cattgtatat ttaaactgta gcaaccctct gaggctttgg 33300 

ttttcaagtt gtcagtttga gagccctgac atgggcaagc taacagaata ggactgtgcc 33360 

cagaaaaatg agtgctagcc cactgcccct accttcatct ccattgggaa ggaagatctg 33420 

gggtctccac acaggctgct ccaaaacccc acccaaacta ccttctcagc ccaacaacct 33480 

tgaattttcc atcaccttct tcacctccat ccactcacac agaacgaccg gacactctgt 33540 

gactgtgttt atgttctcag tgtatccaaa ttattacagc tgagcactgc agatgaccca 33600 

aggcagctgc aggttttata cagaaacctc taaaagtctg taaacctggg atggttttat 33660 

atgatcttta taaatttgac acttttaaag aagatctctt tgaaaaacct ttccctccat 33720 

gtacatgtgt atgtgcatgt gtgcacacac ttgcacgttt acatgctaaa aaaatgatct 33780 

gcattacata aactgattct aataatgctt ttcataaccc atgttgaaat aatttttttg 33840

catgtaaaca actatgatat aggccatttc cagcctggtc ttttttttta aataaataaa 33900

taatttattt tgaaagtaaa ttgacttagg aaaaatttta aaatagtaca aagaattgca 33960

tttttctctc tcatcttccc actgagcgtt ttgacgtata atacactgat cattctctct 34020 

catattgcac tgctgttttt ctgaattatt ggagaggaag ggtctgacag taccctatta 34080 

tcactatctc attcatagcc atattcagcc atcaaaatta ggacattaat gctggcctac 34140

cgtccactcc catagctcca ctcaaatgcc tccagcagtc ttagcagcaa tataaaacat 34200 

cctatttgtg taactatgtg aactataatt tatacacata aaataacttt cctcctggac 34260 

caatgtttag tcgccaaacc tgggggtgct gttgtgttgt gccttccttc atgatattga 34320 

cttttttttt tgtttttgag aacttggtag taattttgta ggctgcttct tggtttgagt 34380 

tcttcagggc tttcctcgtg gttgggtcca gttctgcatt tccatttcca gcaggacttc 34440 

tcagtggtgg tgctaaactt actactcctg ctcaaggtga catgccggct gctctgcgca 34500 

gttgttagag tacgctctga gtgtgaagct aagataacat ctgaaggggt ttctctttgc 34560 

aaaattagat atttgttatt tggggtctta atattccctc cctctctccc tccgtccctc 34620 

cgtccctctc tccctctgct taggattgaa accagagtct cacactattc tgttcttggc 34680 

cactgatcta tcttcacacc ctcaggaatt cagggtcgtc ccaggcatca tagtaagtcc 34740 

aaggccagct tagaccacac gagtccctgt caaaaagaaa ataaatgact gcagtcaagg 34800 

atagcagcac agcttgaaat cccagcactc agaagcacag gcaagaggat caggggtttg 34860 

agtccagctt cggctgcata ataaaaccct gtctcaaaat aaacaaataa aagactgatg 34920

actaggctga tagctataaa caacttgaat gtcaactgcc tagcataccc aaggctctga 34980 

attcaacccc taacaaagca aataaaataa aactagaaaa ttatacacac acacctgttt 35040 

tgtggaaata atccgagact atctaaatgt gtcatctctc aaactctcac ccacctggtc 35100 

ttagcatccc ctggtgaagt gcgtctgcat cagtagttac catgatagct gccaaatgct 35160 

gactgcctct tctatcattg gtcccacagt tattaattga catttcgctg tagggaaagc 35220 

ttcccttctg atttatttat gtaaggataa agttgtgaat tctcatttta cccagtggat 35280 

tgtaactatc ttcattattc atttcaatgt tcagagtgta ctggctgctc ctcctgctac 35340 

acttggtggt tcctggctct ttttgacaaa ttgtcttccc gttcaaagca tggcttttct 35400 

ccctagaacc tactagtgca gtggcccttt aatatagttc ttcatgctgt ggtgaccccc 35460 

aatcataaaa ctaatttcat tgctacttca ttaactataa ttttgctact gttatacatc 35520 

agaaggtaaa tatttttgaa gatagaggtt tgccaaaggg gtcacaacca atagtttgag 35580 

aacctctact ctagaaatat atatctcagg ttctgctgtg ccttcccagc cccagattgt 35640 

aggctcttag ctgaatttaa gaacctctgg gtctttgtaa tggggaaagc acccaagatg 35700 

tgaacagtgc tcattgctgt tagtctctgc cctctcagca gacagaacca agaaggtggg 35760 

ggaggaagtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tatacaattg 35820 

atactgttac ttaccattca aatgccaaac aacagaattg ctctgcaact ccctccctag 35880 

cctggcttgc atgcttatgc ttgttcctta gcaggtnnnn nnnnnnnnnn nnnnnnnnnn 35940
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 36000

nnnnnnnnnn nnnnnnacca tgaccccagc tcggaagttg gctctagcta cttcaaaggc 36060

gacgaaacaa tgaaactgtg ctaataacca gaggaaaggg agctgggcat ggtgggtcat 36120 

gcctataatc ccagtacttg ggaggctgag gcaggaggat ctcaagttcc actctagcct 36180 

gagctatatg gttccagagc aaaatccttt ctcagaggaa gaaggaaggg ggtgagctgt 36240

cttctttggg gatatccttt agagaaacca ctctcagccc tagaactaaa gacaggtttc 36300 

tgagtctaaa gcagtccccc aaactattct cagcccaagc tgcccctttc cccaggcttc 36360 

tgggtctctc tagctggcca cgcctcctca gtctattagt gttctagctg taccagatcc 36420 

ctgtccctgc ccctttcccc tttcttccac cctacctccc tcattcaagt acagactctt 36480 

accctgtact gctgccaaag tgatgtcatc ctgaccagca gtggtttgca cctgtaatcg 36540 

ccacactcag atgatgagtt tgagtcaagt ctgggctata atgtaagaac ttgaatcaat 36600 

caataaataa aagttaaaat atcaatccta catgaatgca caccccagct cttgagtgct 36660 

cccaaaagct ttcaaaatta ttgttaaaaa taatagtaag ttctacaaaa tcccagaggc 36720 

cccagaaatc atgcgctctc cccttcaccc acatgaagtc tttttggcca aggcagagga 36780 

ggtgagtaga cattgagaag aaggcagacg gaaggtagag gaaacttcct gatgactgca 36840 

gagataggag agggcagagg ggcctgtgga gagagactca ggagaagctg ggtcctagac 36900 

acactgataa ctgcataggg agatggcaga ccatgctatc taaaacacaa aggcctggtc 36960 

aggccagctg aactcctgag ctgcccctgc tgccaccccc cttccacagt cccccttccc 37020 

cttcccctgc cccctgccca gcacagctct cagatcttct gtgtaagttg tatctgtgga 37080 

acacccccaa gcttggactg agtctagccc ctgcctgtcc ctcaatcacc gtcttttcaa 37140 

ctggcagact cagggtcctc actaagaagc agaaataata aacacgatca ctggcacccc 37200 

tctgcccgca gacaaatgtg ttttcaccct gctttcaccc tgctgcctaa gccttgtctt 37260 

ctgactttct ccttggacct gaagagggtg ctttgatttg gtgtgaatct gcatggagct 37320 

ctttggtttg ggttccccac ttcccacttc ccaccatcta taaaatctcc atggtaactg 37380 

aatgggccgg cctcagaatg gctaagtagc taaagatcgc cttgaatccc tgatcctcgg 37440

tctctgcctc ccaagggctg gcatcttggg cctgcagtcc tgctctggga agtcttttca 37500 

acctaaaagg aaggctaccc caaccttcag gagtggccac ccggagcacc acacttctgc 37560 

tcagcggctg gcagacttcc tcccctgtgc ccagtctatt tcagttctgc atttagagat 37620 

gtcagacagt gctgacatta aaactatccc ctgggatgtt tcagcttgtc tagggctctg 37680 

accgcccagg cctgctcttt cctgctcccc acccccaccc tgcccccagg gtattttcag 37740 

tcatccccag gactgcatct gaaaaagcag ccagcagtct gaccaccagc tgaccttcag 37800 

atgaccttag tcttaaaggt accagtgccc tccctcctct tcctcaaacc tggacccccc 37860 

cagcctgaag gccaaccaac cctccatacc caaaaacctt tgtgcatggc tttgtacctt 37920 

ctgggatatt gctccctggt agatgcagat gccaagagat agaacaagat tttccctata 37980 

agatctagct gggcagtcca ctcccctcca cctccctgca ccttttatnn nnnnnnnnnn 38040 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 38100

nnnnnnnnnn nnnnnnnnnn nnnnnnnngt tttatggagc agatgtatct cagaacagtg 38160 

ggctgtggtt atgtaccgta ccagtttccc ctgtcaacag acctggctat aagacagggt 38220 

ttttcaaggc tcttgtgaag ggggggggtt ggtaaagtag gaaccacagt tatatcccaa 38280

ggcagagtga gctgtgcatt caaagcttga ctccttgggt tgttgtctat aaaggcctta 38340 

taaaaaacat tgcttctagt tactctgtgt agacatatta actaatcaga gacaaggact 38400

atatttcatt tcgtgaacag aactaatcct tactttattg tcttgtgtgt gacttttgat 38460 

tctgaacata ataaattcat gtccagtctg ttgattaagc aaatcccatg gagaggtcta 38520 

tcagcctcaa gcctgggata tcccagttac aagcattatt ttaaagtagg aagagcctga 38580 

ggctgagggt ggagcacagt gaaagggctc gggccacacc atgcaccaca cccgctcatc 38640 

ttcaggttct ctctctcttg gatcaaactg cctgccacat gctatagtta atctgttaag 38700 

gagagaagga attcatgcct gccaagtgga agaggagagt caccggggag caggggcctc 38760 

acggtccatg ggtattgtgg ggtatcattc ctaaggggca cagtgagggt acatcagtaa 38820

acagctcaga aggctgttag aggcccacag ctaactcgag acctttgggg ttccttctct 38880 

atttcaaaat gacttggtgt atcttttaag gaaggcctta gctgcagaat gaggaacaaa 38940 

agctggcggc ctagacgcta agaatccaga gccacgcccc tctgctggca gggaactatg 39000 

aactaggcct tgatctgagc ctcatgttcc atgtgacaca caggtgcttc tctgccatca 39060 

tgattgtcct ctggttcatt cactcctttg agtttagcca gtgagagccc cagccaggtg 39120

gtaagacagc gagaggaaga tagtgaagaa gcgggggtgg gtcctctggc actggctccc 39180 

tcctgctggg agttctcagt gggctccagc actatccttc tgctgggagg ttccaggggg 39240

gtaggggtaa actgcttccc tccgctgtta gctctagaac tcttcctctc tacctaccaa 39300

actctcccca aatgccagtt ttatatggcc atctgcttcc tgctaggact atgactaatg 39360 
cctatgacac
tgggattgtt 
agcccaagca 
agcaactaag 
tacggccctt 
ggtcccagtg 
ctacaaagtg 
aaaacaaaac 
ccataggcca 
actgtgtcaa 
gtaaccccac 
gagatgtagt 
gtgcaacatc 
tcctggccca 
tagtggggtt 
ggccttttaa 
tctctggcct 
aagacaggct 
tttgcttccc
gagctgtccc 
cctttgcaag 
tgtctcatcc 
taaataaata 
gattaaagtc 
agccagacac 
tcctaggagc 
gcatggggta 
aagagccact 
cctattacaa 
taataagtgg 
agaggaaagc 
ctggaggagc 
ccggaagcct 
cctccatgac 
agcagtcaca 
ttcctctgct 
tgcctgccac 
tctgggtttg 
tgcacgcccc 
cactccctga 
ttgatatttt 
ttctctctaa 
ttacatctta 
gagaatattt 
gagattgagc 
tgggtttgct 
gtatatatgc 
acacacacac 
tggagagcgg 
gagactgttc 
gttcagagac 
aaaatacctt 
gatacagtct 
atgtccagag 
cagtcctgga 
tcagtaagcc 
caaattggca 



ctttcttgac 
gggagaaaag 
cacagaaggt 
ggggtggcgg 
tgtaacgggg 
cttgagagag 
agttcaagga 
aaaaaggcaa 
acctgatcta 
gttcacaatt 
agaatcaggg 
atgtagaccc 
tcatgtggaa 
cagtcagtac 
tattggctac 
gcccatctta 
gttgggaccc 
gcttcctgct 
ttaggagggc 
tgagccccag 
gacatttctc 
ccatctgtct 
agtggatctt 
caggcatgta 
ccgtgaatag 
aatgtgctga 
tcatatggcc 
gacaccatca 
aggctttacc 
gaagacagtc 
ctcccttctt 
ctgcttcccc 
gacccaccta 
aatgcaagca 
actgcctcca 
tgcttaatac 
ggccaccttg 
gttctggctt 
ctaagcccca 
tgagactcct 
cattctcttt 
ccagccctga 
tcaaattatt 
tggaaatctt 
agcgtggggt 
aagctctcta 
tctctcacgc 
gcacacgcac 
gaagcacgca 
tcaccaaagg 
aagtctgaga 
acaaaagcaa 
acagcaatgg 
tcaggaaacc 
cctcagacca 
ccacctagag 
agattcctga 



cacattttac 
ggggtacacc 
atctaaccta 
gaggaaaggt 
aggtgaaagt 
acagaggcag 
cagccagggc 
tactcctctc 
gacaattcct 
aaaactacaa 
gttgccaggc 
cgctcctgtg 
ggggacctct 
actgccacct 
acagagccgg 
tgaagacagc 
atatctccac 
ggcccacatg 
ccttgtatcc 
ctcctgggta 
tgtcaaatgg 
catcagttgg 
gttgctctga 
gcttgaaggt 
aggcatacag 
cttctatcta 
agacagcaag 
cgtaaacaca 
tccaaacacc 
caaactatcc 
agcagcgccg 
atccttgaca 
ggtggtcagc 
tgagcagggg 
gctcctgcct 
acagcctatt 
aacatgtcta 
tttttttttc 
atcgcaccag 
tccccagatt 
ccgaatgccc 
caagaccaca 
cttgttattc 
ggcaggggct 
tttgaggaga 
taaatttact 
tctctctctc 
ggatttttgt 
gagggaggcc 
ccgggtgact 
gcttcctagt 
cttgaggaag 
ggatggcaca 
aggacgaacg 
tgggatgatg 
aagcccaaag 
gtctgtccca 



attttaaaat 
aagccccacc 
gttttatttc 
ttgtttttgt 
ggtagccagg 
gtaaatctct 
tacacagaga 
ctattccctg 
cactgagact 
attcttgcct 
tcctcagaat 
gggagggggg 
aggctcatta 
agaagatctt 
ccattcacta 
attttatact 
catctcctct 
gcataacaat 
tttctgggta 
gtgaatctat 
acccagcacc 
ccttgtatat 
ctgagtacca 
aaagcaaatg 
cttatttggc 
gggaggacct 
ggtgagctga 
cacacacaca 
atgtacatat 
cacacttggg 
ctccaagctg 
tttgacctca 
catcctggcc 
ccagggaggc 
gctccttttg 
cacccaccat 
caattcaata 
ctgctgtata 
tgttcacaca 
cactcaggag 
agtggcattc 
cacactttac 
cattcttgtt 
gccactcagg 
caggggagtt 
aagacaacct 
tctttgtcac 
gtgtaaaaca 
tggagcccat 
cactacagtg 
tgtagcagct 
aaacgagtta 
ggacgggaag 
ctggagtcag 
tcacctacat 
acttgtatct 
ttcacctgag 



agtcgaaaaa 
tctcactgtt 
ctgttgctgt 
tttcactcat 
cagaggtggc 
gagttcaagg 
aaccctatcg 
cccccccaca 
ctcttcccag 
gtcattctga 
tacaggaagt 
cctcttaggc 
ctaaaggaca 
cccacgtgtt 
tccactctgc 
gtgcctccaa 
gcctctgtct 
tgaaatttat 
ctagaagtac 
attatagtgt 
cacctggcat 
gtgcatattc 
caaacagggt 
cctagcatga 
tcacagttct 
tcttgctgta 
gattgctttt 
cacacacaca 
gaatacaggg 
gagttgtgcc 
gctcatgctc 
accctacact 
tgccaggccc 
atggctgact 
gggatccctt 
tgtttacccc 
gtgcagcata 
agataatgtg 
tgcagtcctt 
tcagcccatc 
gaatcgattc 
ctgaagtgag 
tgcatttgaa 
tggtttaaca 
tattgaaaag 
aagccataca 
acacacacac 
catgagcaca 
atggacagcc 
tagaagaccc 
atttgtctca 
atttggctta 
cagaggccac 
tttgttttct 
ttagggtggg 
atggtgagcc 
agcataccaa 



gcaccacctt 
tctacctcca 
aataacaaaa 
aattccacat 
gggtgctgtt 
ttagcatggt 
taaaaaaacc 
cacacacacg 
ataattctag 
ttccagtgtg 
acggtagacc 
aggtgaatat 
ttattgttcc 
ggtttaggct 
ctggacatag 
gagacccctt 
gctgcctgtg 
ctcatgggaa 
aaagcactca 
ggccataagg 
ctctgcctcc 
attgcataca 
cacttttaaa 
gcaaggtcct 
ggatgtagga 
tcataacatg 
cctctgcaac 
cacacacaca 
attgcctttc 
tatgggaggc 
atgcatccac 
ccccatcttc 
tgctccctgg 
aacagctctc 
tagcttctcc 
atccccactc 
gagcctcgtc 
catctcagcc 
aggcactctc 
tcaaatttct 
gaatcatcga 
ttcacaggtc 
ccactcactg 
gtgctcacaa 
aataaacaga 
gcatgagtgt 
acacacatac 
caagtacact 
cgagtccaca 
agaacccaag 
ttggtgtaac 
cagttcaggg 
tggccacacc 
cctctttact 
ttttcccacc 
ttaattccat 
ccctcccaaa 



39420 
39480 
39540 
39600 
39660 
39720 
39780 
39840 
39900 
39960 
40020 
40080 
40140 
40200 
40260 
40320 
40380 
40440 
40500 
40560 
40620 
40680 
40740 
40800 
40860 
40920 
40980 
41040 
41100 
41160 
41220 
41280 
41340 
41400 
41460 
41520 
41580 
41640 
41700 
41760 
41820 
41880 
41940 
42000 
42060 
42120 
42180 
42240 
42300 
42360 
42420 
42480 
42540 
42600 
42660 
42720 
42780 
tcaagcagtc
gccttccctc 
ccccaagcaa 
agcctctgat 
ggatgtggca 
atgagtttga 
aataatagtt 
accccacaaa 
ttggtctctt 
aaatggatga 
ttcctactgg 
agcagtagag 
tgtgttgctt 
agaaacatag 
agaggatcct 
gtctaaagga 
aaacaatttt 
aatcccagga 
ttccatagtg 
aaaaagtaca 
gcttgtaatc 
acgttgaatt 
ccacgtgtgg 
gcttggtctg 
ttgatgcctg 
aggacggatg 
ataagtcatt 
tgctttgcat 
tagaaatggt 
aacttgtaaa 
aactggtatt 
acttatagaa 
ttgcaccttt 
gagcttttct 
tgcttgggtg 
gaaggtgtcc 
gaaaacatgg 
attgatgagg 
ggtgacaaaa 
ggaggttgaa 
cagtagcaaa 
aacctgagga 
atgattggcc 
cggcttgttg 
aaagagatta 
cgcgtgaatg 
actatagtga 
catgtaccat 
aaaatttagg 
agacaccacc 
tttgtgaagg 
cctgtagcag 
gttacctcgt 
ttagacatgt 
caggtgtctt 
ccttcctgaa 
acagaggcac 



aaaccatcag 
ctgtgaggac 
tgggtttctt 
ctctaaacct 
gctcatgcct 
ggccaacctg 
agacaagagt 
gccgcaggag 
gctctggtct 

aggggcatct

agctgcctct 
cagagtcacc 
ggacaacttt 
ttaagggctg 
gggttcagtt 
tatgatgccc 
atattttaaa 
ctcaagagac 
agacccactc 
agcatgaatt 
tttaagccct 
atatagccag 
tagcacgtgt 
gatagtttag 
gtgacacaca 
tttccttagc 
aataaacaaa 
gtgtgagaat 
cttggtgggg 
aatagaactc 
ttctaatttg 
atcattttta 
gcaaagggct 
gggatatggt 
acattctagc 
ccatgaggtt 
ggctggtgac 
ctgaggagga 
ggactctgtg 
ggatgctttc 
attacagtta 
ccgtattaaa 
gggaactcat 
cctttgtggc 
gataagagta 
gaacaccatc 
aagttgagag 
cactaaaggc 
atgccaaggc 
atctgtaaag 
ccagcacaga 
gacacccttc 
gagacgctca 
agtgttgaca 
gttatactgt 
gaccctccag 
atggtgaagc 



gaataggaat 
tttctaactc 
tcagcaagaa 
gaagatggac 
gtaatcccaa 
agctactatg 
gaactatggg 
actcactaga 
ggctgtctgt 
tggtggggtc 
tccaccaaca 
tccatcatca 
ctgacttttg 
gatgggggac 
cccagcccca 
tcttgtggcc 
aaagtaacta 
aggaacagaa 
aaaaaaacaa 
ttcctatgtt 
tgaggatcta 
accccatttc 
tcttcctcgg 
aggaaacctg 
tctgtcatcc 
ttgtgtacag 
tagtcggggc 
ctgagctcta 
gggggagcaa 
ttggcccttt 
gatataataa 
acagatggag 
cctgggagcc 
cactcctagt 
taaagtgtgc 
cttcacgaag 
ccacctgcag 
cactggtagg 
gtccagcctc 
acaggggtcc 
caaagtagca 
gggttgcagc 
atgactctga 
cttagggact 
gcccatatta 
agcagtgcac 
gctccatgca 
aaccaaaatg 
aggaaaagca 
aggcttctaa 
gacaaggagc 
ctttcatctc 
gcaggtcagt 
tcccaggcct 
gggctagctg 
agggcagcca 
atgtgtctta 



ctaatataga 
aaacccatgc 
ggtagctagc 
acaagtcatg 
cacttaggga 
tagaacattg 
cagacacttg 
gagagtatgt 
agggatagcc 
tgggctaact 
tacaaaagag 
aaagcagagg 
gctgaaaaaa 
caagtggtta 
catggtggtt 
tccatgggca 
agtctagcgt 
aaatctctgt 
acaaaaacat 
taaaagcaat 
atgcaggagg 
aaaaagcctt 
agggttgagc 
cttcaaaatc 
cagcacccag 
tcctcgcttc 
ccggggtgtc 
ccttcaatac 
aaatagctct 
gagattatag 
aaaaatatat 
agaaaaaaat 
actggctgcg 
ctcttcagtc 
tctgtccggg 
ctggaccagc 
taccccgtgc 
aagggaagga 
acgggtgctt 
cgtaccagat 
acaaaataat 
actggattag 
agtcaggtgt 
tctgactctc 
atctctcagg 
tccagaaatg 
cataagtgag 
gtcatggtgg 
tgagtttgag 
aatatgtcgg 
cagtcttgga 
aaaggttcct 
gagttcccgg 
ggctcctacc 
tggattgcct 
ggagagctgc 
gatggccatg 



tgtgactgag 
tcctattggt 
tggccctaaa 
ttaaaatgat 
gctgagacag 
tcttaaaata 
gcccatgtgg 
atctcattag 
atcttagcaa 
attcagggaa 
tcctggtccc 
cacaaactgg 
aaaaaaaaaa 
agagccttga 
cacaacgatc 
ctgcagtcac 
ggttactagt 
gagttccagg 
agaatttaaa 
aatcaagtgt 
gtggttgcaa 
aactaaatac 
caggaggatg 
aagtttaaaa 
gaggcagagg 
aatccccggt 
actcacggta 
caaaaataaa 
gttcaccaca 
gtgatttatt 
actaatttgt 
agttttaaaa 
ggatgtgagg 
tctgctccag 
gcttctgtcc 
tcatcgactt 
ccctggagga 
agggatggca 
ctagactaga 
atctgcattg 
cttatggttg 
acagtcttgc 
agactagaat 
tgaaccacaa 
atgtcacagg 
aagttgccat 
agggtgggga 
gacatgccac 
ctagcctggg 
agaaaaagag 
gagacaggta 
tgcctgacag 
caggttcccc 
tgtccctgga 
ggcctctgtt 
ctcaccatcc 
gtttgaagcc 



tcaaccacct 
ctggtttacc 
gaaatggttc 
tggggcagct 
gaggcttgcc 
aaataagtta 
agggatcatg 
ctgcagcaat 
gggtatatct 
gaccttgaat 
agagtgtgag 
tgatgatccc 
caaacaaaca 
ctgctcttgc 
tctaactcca 
atagtacaca 
gcacaccttc 
ttagccaggg 
caaatggaaa 
ggtggcacaa 
gcttgaggct 
agtaaacaag 
acatattcaa 
cggcagactg 
caggagatgt 
gctgtaaaac 
gggctatgtg 
atataatttt 
cacacaccta 
ttttttaaaa 
atagtaaagt 
atacttctat 
cgtgagtgga 
gtggcctttg 
tcaggcatcc 
ttacaagaag 
ggaggatgct 
agtgtgggga 
gtacctcttt 
aaattcataa 
cggttactgc 
tggcacagaa 
tcctgctccc 
tcttgccttt 
accagatgag 
cagtttcttc 
agtgcttctt 
aggctattcc 
ctccatagtg 
tttccaagag 
tgcagaaagc 
aagcccacat 
ggggaaagca 
gagaatatgc 
tcaaccttct 
tgagggagcc 
tgcttagatg 



42840 
42900 
42960 
43020 
43080 
43140 
43200 
43260 
43320 
43380 
43440 
43500 
43560 
43620 
43680 
43740 
43800 
43860 
43920 
43980 
44040 
44100 
44160 
44220 
44280 
44340 
44400 
44460 
44520 
44580 
44640 
44700 
44760 
44820 
44880 
44940 
45000 
45060 
45120 
45180 
45240 
45300 
45360 
45420 
45480 
45540 
45600 
45660 
45720 
45780 
45840 
45900 
45960 
46020 
46080 
46140 
46200 
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ggggactgcc 
accaaaattc 
tgtccttgac 
tctgtcactg 
agaatctgtt 
aaagccaagg 
gggcctgcag 
ggaatgtaca 
gctgaccaga 
atgtcccagt 
acaatctccc 
tctaaatttg 
agagaaggaa 
attttctgcc 
gatctgccaa 
accaactgac 
gccaggggct 
gtcagcatgg 
gggagcctgg 
cctcactgat 
agctgcctcc 
ttgcaacaga 
ttcagcgtct 
aagccggcct 
ggggagcagc 
caggactatc 
gtcacatttt 
ttaggtcttt 
ctgtggtcag 
ggaaatcact 
gggaggacct 
aggaaaagca 
ccaggatggc 
cttagaactc 
tacatggtct 
aaggagacaa 
tcttgcattc 
tccaacactt 
aggcttgata 
tcctcccacc 
atggaggcct 
tttggacttg 
ccctcggctt 
aggctgtctc 
actcaggagg 
gagttctaga 
aaacaaacaa 
tgttctcatc 
gctctgcagt 
agcacctgaa 
tgaagacggg 
agctccatgg 
gcccctaccc 
ggtctcctct 
ggaatgttcc 
gggaaagttg 
gctggtatgt 



ctgtctcctc 
acaagaaggc 
agggcaccac 
gcatcccatt 
tcctgacctt 
gacacccagt 
ttccagggct 
aatacactga 
gtggccatca 
caccttgact 
agatgaaagt 
tttctgaggg 
ggcggaagtg 
cctagtccct 
tcaatctgtc 
cctaggaggt 
tcagatctaa 
atttaggcag 
acctctccgg 
ggcatttgtc 
cagaaacatt 
gaacccccga 
acagagcatg 
atgcctagga 
agacctgctt 
tccctgcccc 
tcttgtgaga 
ctgccctgta 
atggcacctc 
ctacagggct 
ctccatgcca 
atgtgactat 
accaccaact 
catcaagtgc 
cctgggatat 
tttgggtccc 
ctggccctaa 
gggaagtcaa 
ctgaaagatc 
ccacccccaa 
gagtgactcc 
gcatgagggt 
acagctgtgt 
tttaaaacac 
cagaggcagg 
acagccaggg 
aaagaatgac 
ctaagaggta 
cccacctgag 
agccatccag 
ctccagcaac 
gtaacggaga 
acccaggagg 
ctgtccagat 
aggcttgttg 
atcatttgtt 
atgcatgctg 



tgcctgacct 
cacaaggcca 
tctctcttcc 
agaggcaaac 
gacttcacct 
gcctggacca 
caaaataggt 
agaaaaaaac 
gggggcaggg 
ttgtgcacat 
caagttgttt 
tctagaagag 
agagacagac 
gactatactt 
caaggccagc 
acttgctgag 
gggtctgaaa 
gaagaatggg 
cttgtggtag 
cctgctgttc 
cctatgtctg 
gcccctgagg 
gataccagtg 
cccagcatct 
attcctagta 
caagtcatgt 
ttttatgatc 
agtacaagat 
cacacccacg 
gctcctgcct 
cagccacacc 
ctacgtgctt 
tatagcttca 
taagagaaag 
gtgacagtgg 
ataccaggcc 
tcactcacag 
ctggatgttg 
aagagttcaa 
ccctagaaaa 
aacccctcta 
cagacctatt 
gaactaagga 
aaaaagtcag 
cagatttgtg 
ctacacagag 
catttcccag 
ctgtactctg 
gggactttca 
gattatctga 
ctccctcacc 
gccctgagag 
atggccacag 
gtctctgcag 
gaagctcttg 
tattcctttt 
gggatgtagg 



ctagagatca 
catggcagcc 
ctcaggctac 
ccttcaatgt 
ccccaaagct 
aggcatctgg 
tcagatctgt 
ccttgtttct 
cagggcttgg 
gaaggacatt 
gatattttag 
ggaaaggaag 
agacagacag 
tctagacccc 
tatcttgaag 
cagcatggta 
gcgaggtcac 
aatagtgggg 
aaatgcggaa 
ttttagtaga 
ccgggcccag 
tcacccggct 
ggtgagtttc 
agcgtccggg 
aagttgaaag 
ttgttcttag 
ttcttggagt 
tgcttttttg 
gtcgcccacg 
gttgtactac 
ttagtctgag 
gctcgctccc 
ctaatggagt 
gctgactgct 
tgaggggatc 
aaatccaggc 
aagtcaggca 
tcttatacct 
agccagcctg 
atataaaatt 
tactgtagtg 
tgttatctct 
gctagagaac 
atggaggctg 
taagtttgag 
aaaccctgtc 
cccaattacc 
tggcctctac 
ttatttcttt 
gcactcagct 
tgaagaagct 
aggggtgggg 
caagagaaag 
cactcacagt 
ctctcataga 
agggtattgg 
gcctcatgtg 



gcagctgagg 
ttgacctggt 
catgcttttt 
gtcctataaa 
cctggtcacc 
tccatcttgg 
ctcccacagc 
gtcacctgtg 
gcaggaagtc 
tgaaagggtg 
gtctcagaag 
cctgaggctg 
acagacattg 
gagagccaag 
gtgtccctgt 
gccagaacaa 
ctcaagaggc 
cagactggga 
cagtgctgta 
aagtgtcatg 
cgaggccaag 
gagtctctcc 
tacttggagg 
catgctcagg 
ccaatgtctg 
tgcatttcgc 
atattttaca 
tgtagaaggt 
ttcctgaagt 
atcccaccac 
aaatggtagc 
cacatgcctc 
acacagtgct 
aagagccaga 
aggtaacaca 
ctgtggggac 
tggtggtata 
atagtgccag 
ggctacatag 
aatactacat 
ctccaggtca 
aggctagctt 
tggccctcca 
tatgcacctt 
gccagcttgg 
tccaaaacaa 
cctccaacag 
ctgactttga 
gtttctcagg 
cctcctggat 
gatgtcactg 
gagcttcatg 
tgctcattag 
aattggccca 
atctgagctc 
gggggcacgg 
tgctaaatac 



tatagctgag 
tctgtagcta 
ctcagcttgc 
ggctctggag 
tgcaagaccc 
gtcttctcga 
cagagggaaa 
tagtgctatg 
ctgagtctaa 
ggcaagggtt 
ccctttgctt 
gagagactga 
ggtgggcttg 
tcatcttgct 
gcttctcaga 
caagaccaca 
tccctatgta 
cctggaactg 
cggacagcat 
tcaccacctg 
gaccttcctc 
gagacactgt 
gtcccctggg 
cctggactgc 
tctttctagt 
tattaagatc 
ttcataactt 
cactgccgtt 
gtttctcatg 
aggcagctgc 
cccaaacccc 
ccatggggct 
aagtgacatt 
gcccagggag 
gcctaactgg 
cactatctgg 
tgtctgccat 
cactgagaga 
gcagatccta 
tgacttccca 
cctttgagag 
gttcttccag 
cgtgtttctg 
tgatcccagc 
tctacagggt 
acaaataaac 
cacccaaaag 
tcaattcctt 
cttcccgagg 
tccgactttt 
ctctgcaagg 
ggtaatggga 
agtgaccctg 
ggtggagtct 
taactgagct 
atgtatgcat 
atgtgcctat 



46260 
46320 
46380 
46440 
46500 
46560 
46620 
46680 
46740 
46800 
46860 
46920 
46980 
47040 
47100 
47160 
47220 
47280 
47340 
47400 
47460 
47520 
47580 
47640 
47700 
47760 
47820 
47880 
47940 
48000 
48060 
48120 
48180 
48240 
48300 
48360 
48420 
48480 
48540 
48600 
48660 
48720 
48780 
48840 
48900 
48960 
49020 
49080 
49140 
49200 
49260 
49320 
49380 
49440 
49500 
49560 
49620 
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tctgtaatgt tttcttgttt gtttttaact caaattaatt agaggcagtt tctctgttta 49680 

aaaaaaaaag gtggaagaca ctgccatctc atttgtgttg ggacttgaga tatatatata 49740

tatatatata tatatatata tatatatata atttatttat ttatttattt gtgtatgtgt 49800

gtataagaaa gcgggggaca ctcatgtgcc atggtgtgta tggggacaca cgtgtcctgt 49860 

gggggaacat tgtgtcaggg gaacaagcat gtgctgaggg tggggggata tatatgtgcc 49920

actgtggtga tttggttggg ggagacatgt atgtgctatg atacctgtgt gtaggacaga 49980 

ggacagactc acacaggagc cttcttacct gctgagctgt ctcagcagtc caagccctcc 50040 

aggctgtagc cactcttcct ccttgctaca gtcccacatc cagccaacac ataaaggctc 50100 

tggcaggaaa taaattaaat ttgctttgtg tgtgggtgct gacggctcaa gtcttccggt 50160 

gatggtaatg tttaaagcaa gccaacatca gttctccagt gccgaagtta ttgaatgact 50220 

gaccaatggg taacacttag gatttttaaa aattattttg attgtacatt ttgaattaca 50280 

ttaatcttaa aataaactac aaacataaac acactgtggt tggagctggg catagtggca 50340 

catgccttta attccagaac ttgggaggca gaggctggca tatctctatg actttgaggc 50400 

cagcctggtc tatataatga gttccaggac agagagatcc tgtctcacaa acagagaaaa 50460 

ccccataact atacttttat tgtactggtt ggtctacagt gtaaagaatt ggcaaagaat 50520 

atgaaagatt acactgggaa gaaactgaaa gccatccaga gtaatgaagc aaacccctca 50580 

caaatgtggg aaacatttca catgggtgag ggcttgcagc cagcctgtgc taatttatgt 50640 

tttggtacgt gagcacttag agcagtttcc agttctctgg tgctctacca atcttagctt 50700 

ggttaatatt cagggatgag tcttccatca ccggtaatct aatttgccat tctatttaaa 50760 

ggctcttaaa ggcacaggca gtgcattggg taaatgtggc aaataatttc tttgacaaat 50820 

cgaccaattg tcagattggc ctgctagtca tttgtttcaa tgagaactgt tttttctcaa 50880 

aggatgctct tgtacaccgg ctagaagcag ggctgtcatt tttataggtc tctgtggtat 50940 

tttgttgttg ttgttctgtt gttgcaaatc atttatcact gagggaaaat acacacaaag 51000 

ggccctttct ttaaaagtat acatgtatca ttttgtgaca gctcataaga agctgttttt 51060 

ttctgcctgg acacaggtcc tgacctgtgc tgtgtccttg ctaagctttg tcagaccctt 51120 

ccacagctcc ccccaacaac gagttcccca gtacctgcct cacctcatca ctatggtgac 51180 

agcagcctct gatgcgcctg actctctggc acattatggc agtgttaaaa gcttccatct 51240 

ctctctttgc tgaataatga acctcaggtt gttcaagaga ccggaatgtt cttcacctgc 51300 

ctgcacacat ctcttcactt tcttttatag atcaggtagg gactgggcgt gtagatggaa 51360 

caaactgttt tccgttcccc agccatctct gcaggtgcac tccacataaa tcaagtgtta 51420 

aaagtgcttt gattaaacag gacaggcgcg ttcttgagtt catctgttca catactgtct 51480

ggcaagcgct gactgagggt ctcctctgta ccctgttctg agaactaaca aaagacgaat 51540 

caacatacag aaaactgtta tttagtgact gattaaacta acgaaggcat gggctggaga 51600 

aataactcag cagttaagaa catttgctga tcttgcagag gacctgggtt gggttcctag 51660 

cacacacagg gacagttcca gtcccggtgt gtccttttct cacttctgtg gacacaagtt 51720 

ttacacatag tgcacacaca tacactcaca tatataaaac agaacattta aaagtatgtt 51780 

taaataacgg aatcatttat ataggttttc atttacatag gtaaataggc aaaaatctgc 51840 

attttattgt ttctaagttt taatttattt ttctctgtgt gtacatacgc atgcctcctt 51900 

atctgtatgt gcgtgcactg tgtgcatgca tgaacccaca gagaccagaa gagtaccaca 51960 

gattctctgg agctggagtg attgataggc tgttgggagc cactccacat ggggattcag 52020 

agttgaactt cgttctctgc aagaacagcc agctcttaac tgatggcttt tacctccagc 52080 

caccttttcc tcatttttaa aatttccttc cttccttttt gagacagggt ctcaatactt 52140 

agctcatccc aacttgaccc cactcttctc ttgccttagt caccacaatg tttagtttat 52200 

aagcatgcgt cactatgccc ggctttaaat aaactcaccc ataatcccag cactgaagta 52260

gacaaaaggg aggatcgatg gggctgactg gccacaagcc gtgcttcaag ttcaatgaag 52320 

accctgtctc aagggaataa ggcacagagg atagagccat acgcctgacc tcctcctctg 52380 

gcctctaccc aggcacatgt gcatacacac accacacaca cacacacaca cacacacaca 52440 

cacacacaca gagagagaga gagagagaga gagagagaga gagagagaga gagaaacttt 52500 

ttcctctttt tttttaaaaa tattatttat ttcatgtata tgagtacact gttgctgtct 52560 

tcagacaccc caaaagaggg catcagatcc cattacaaat ggttgtgagc caccatgtgg 52620 

ttgctgggaa ttgaactcag gacctctgga aaagcagtca gtgctcttaa ccatctcttc 52680 

agccccacaa agaaactttt aatgagcaaa taattgcttc caagtaaata ctactaatat 52740

atttctaacc atactataca aggaattatt aaagaacgga taataggaga ataaaaaatt 52800 

ataagtcact ttataatgct atctaatcca tctagaacaa aaacactgta ataatgcaaa 52860 

agagcgcagt gcctagatta aataaataaa atgcagacca ataagtaaac tttatagcag 52920

cacatggaaa tgacgaaatt cctaacaaaa agctcaagat gggcagttta tttaaagtga 52980 

aatacaggag aaataaagca cagaaagata ctcaaaggca tagaagttaa catagggggg 53040

ctggcgagat ggctcaacag gtaagagcac ccgactgctc ttccaaaggt cctgagttca 53100

aatcccagca accacatggt ggctcacaac catccgtaat gagatttgac tccctcttct 53160

ggtgtgtctg aagacagcta cagtgtactt aacatataat aaataaataa aactttaaaa 53220

aaaaaattaa agaagttaac atagaagccc actcaggacc ccactcagtc ctagagtatg 53280

acattattat ggacattaaa aagagaaaat tcagcagtag tgtgcatgca ctgcatatat 53340

acacaaatcc ttgagtttca taccaaatgc ctttagacca cttgtggctc tgcaaacctg 53400

taatcctagc acttgtgaaa aggtcagctc aggaactttg ggaaggtcat gaaactcttg 53460

cccctccaga agggagaggc taattaacat ttctcagacc acagggcggg aaccgacctg 53520

cgggtgggga cagactgttg cccatttcca gactagggaa gtccttgtca cctcattccc 53580

taaagaccaa tcaatttaaa gggtgcactg ttccgccaat catattgtgc ctagttgctg 53640

atgctctatt ctgcccttag aaaccgtata aaaactagcg aaggggtacc aggggtaacc 53700

ccctctcctt caggtctggg acaatcccac tacactggaa caataaattc ctcttgcttt 53760

ttgcattgat cacagctcca cttcgtggta agctaagact ccctggagtc ttacattggc 53820

aaatgcaggc aaaagaatcc gaaactcaag gtcatctaaa actacatagc aagcatgctg 53880

ctagcctggg ctccatgaga ccctggggga ggggcagagg gagaccgttc agaagacagt 53940

caagatgttg cagcagcaca ggcagcctgg ccaccagtgc tgtcaccaga catgttaatg 54000

ttggaataaa gcctcaatca tgactctccc agttttataa ttggaaataa gaaaggaaag 54060

actataggaa caactgtgtt cagaacacta tttataatag caaagatctc agagtaaccc 54120

aaacttctag acattgattt gggaagatct cttggcagct tattttgaaa actttacaat 54180

gttaaatatg taaaaacaag gacagttttg ttttttgttt tgttttgttt tgttttaggg 54240

atatatattc atatatgtat atgaatgaaa acccaaactt aaaattcccc actatgcttt 54300

aaaggctttc tgacaataac agaaagagaa atagagaatc cataaaaact agttctgaaa 54360

ctatcaatag gcttgacact ctttagctgc caggagagct gaatctgaac acagggaacc 54420

ccacccagca ccccaaattt ggattattgt tttattttat ctttccccta cccccaagac 54480

agggtttctc tgtgtggtcc tggctgtcct ggaactcgga gatcctctgc ctctctgcct 54540

ctctgcctct ctctctctct gcctctctct gcctctctct ctctctctct ctgcctctct 54600

ctctctctct gcccctcgct gcctctctct gcctctctct gcccctctct gcccctctct 54660

gcccctctct cttcccctct ctgcctctct ctctgcccct ctctgcctct ctgcctctgc 54720

ctcctgagtg ctgggattta aaggcatcag ccatcacttc cagcttcctt tatcatttta 54780

aaaagaattt cctatgtgac tactgtattt aaatcaccac acggccaata ctccccccca 54840

actcctccca aatcccctct acccactcaa attcttatct tgtattcttt atcattatta 54900

tacatatgtg tatatatgtg tgtgtgtata tatatatata ctatatactg ctaatgagta 54960

acatttagtg ttattcattg ttgcatgttt tcaatgtgct ttccaggagg ctggggggat 55020

ggctcagtgg gcaaaattct agctgcacaa gcctaaggac cagggttcag atccccaata 55080

taaaggctgg ctggacatgg tggcttgcct atgatactag catgcttgct ggaagcaaag 55140

acagggaatc cctggagact tagaatctca gaagtgatct gggctggaca gactagctga 55200

actggccagc tctgggttca tcaagaaacc ctacctccat aacataaagt gtgatggaga 55260

aaggcaccta atgtcaacct caaaccccta cctgcatgtg cacacacata catccacacc 55320

acacacacac acacacacac acacacacac cacacacaca cacacacaaa taaataagta 55380

aataaataaa atatttagct ctccagacca aatcttggtg aaacccatgc atttgcattt 55440

gtgtgtgtcc tacaaacact gaaggttaag aagcatgctc cttagtaatt ttatagcagt 55500

ttgcgtttcc agattgaaaa cagattctat aggctacaca gtgctaaatg gattatgctc 55560

agatacagat tgaaaaggat acagattgaa aagggtcggg gtctgggcca ggatgacggg 55620

ccaactgatc tttgccgggg cttgtccttc agggaagggt tacaggattc accactgggg 55680

tgtggcctat ctgctgttag gacctgaatt gcctggagtg tttctagttc ccactagttg 55740

ttgaacttta ccttgaacct ctgctcccag ggaagtcatc aggactctgc catccctgga 55800

gtctctgcag aggttgtttg accaacagct ctccccaggc cttcgcccac gacctcaggt 55860

aaggggtttg gatttggaaa gatgcaattg ctataggagg gactctgaag gcagacagac 55920

gcaccgcctc ctcacgttgg ctagtctaat ataaacatcg cggtggatgg tgaggataga 55980

ctccatgccc ttttgtgaag gcatttcctg gcatcagctc ctgacttcag acagtttcac 56040

ccatcagaca aattgcctgg tgttggagga ggaggtgagc agggccattc ccatcatttc 56100

tcctcagaaa tggaaaggca aggaaaacat gaggttcttc agacacttaa tccctgggac 56160

tgcaaaatgg tggtgcccct cctccacagc tgctcacggc ggggcaggag atgagggcca 56220

aatgaagcat agatctagct attttttttt tagtgccttc agtaaattta aaatcaaata 56280

agggaaggga cctagatctt tatgttatgg cattgttaaa agtgagaact tgtagccagg 56340

gtgtggtggc gcacaccttt aatcccagca cttgggaggc agaggcaggc ggatttctga 56400

gtttgaggcc agcctggtct acagagtgag ttccaggaca gccagggcta cacaggtttc 56460

tctatacaga gaaacctgtc tcgaaaaacc aagaaaaaaa agaaagaaaa agaaagaaag 56520

aaagaaagaa agaaagaaag gaaggaagga aggaaggaag gaagaaagag gacaacatgg 56580 

tctaggggtc agagagcaga atctccaaaa acaccaacaa tgcctgctgt aaatgtatgt 56640 

cgttgatttg gggatgttgg cctccagctc accatttcct gccttagcct ccaaagtgct 56700 

aggattatag gcttgagcca acacatctgg cttacgccta ttgtgtgtgg aaggggagtg 56760 

ctgagtgtgc tcctgtgttt ggtacttata tatgaatata tgtatatacg catgtacgca 56820 

tacttgcatg tgaaggccag aggccaatgt cagctgctct catcttatcc tttttattac 56880 

attgtattta tttgtttgtt tgtttgtttg tttcatcgta cgcatgcagc cactcatgag 56940 

catgtaacag cacaggtatg aaggtagact tgcaggagtc agttctctcc ttctgtcact 57000 

tgagttccag gaccacactc cagcccccag gcctgggctg taagagagcc atcttactgg 57060 

tcctctactt tgtcttctga gatagcatct agactcacgg aacctggagc tcatctagat 57120 

ttacattggc tggccagctg atgcatttta aggtcaaatc ttcattccat ccctacccca 57180 

cttccactcc cagtgctgga gttcgggaca cctgccacca agcccagttt ttcctggatg 57240 

cagaagctcc aaactcaggt tcccatgttc gcatggcagg cacattttca gttaagcctt 57300 

ccccccagct cctttaccct ggtctctgaa tgggggggag gctataaatc aggctgctct 57360 

cagacattag gtaggaaata gaccatatac atgaggaaag atattcacct gccccatggg 57420 

taccaggaag tgatgtccaa ctcctctttt gcttatcagg agaaatgctg actactacct 57480 

ctggtaattt tgatgttggg aggaacaggg acattcatag gaccccattc ctcgctggtg 57540 

agagtggaga caggttttct gaagggcagg agatctgtgt agaaaagatg gatgctgttt 57600 

tctgaaggga aatggaggta gagtcgacct gggagagagg ggaggtgggg gatggttggg 57660 

aggaatgaaa ggaagagaga cggcagttgg gatgtattgt ataagagaag aatcagaaag 57720 

aaaaaaagaa aagctacctg cacccttcaa gtgttcctct gtgtgggagg ctgtctcagg 57780 

gactacatgg gcaccgagag gcatcagtga gggtaggtac ttgatgttgt gtccctgaaa 57840 

acaaggacag gaaatctgct gcatggccta agatggcaaa atgtggcaca atcaagtaag 57900 

gcccaggatt ctgtctgtgg tgcagacctg ctgtagaatg agctcccagc attcccactt 57960 

gctgtgtgga gacagcatgt tgcagagcca tgtgaggatg agggtccagg ccaggaggat 58020 

gtcaacccac caccatgtag ccagtgggct gggggagctt gggcccacca aggagcttga 58080 

gcagactgac agtgggttat gtacacaagt gggcgtgtca cacaaccgtg caacacagag 58140 

aaaatccctg tgatgacaac ttctaaacca ccctgaggca aaaggagtag acaggggatt 58200 

agagcctagc atattggagt cgagtggcca tgcagctctt ggaagcgtga ggaaggaaat 58260 

ttcctggaag gataggttgt cttcctagca gcctcgtcaa tagatgtcaa tgtatgaggt 58320 

agtacctgct acaatcctgc ttcttcagaa gactgaggca gggggattac ttgaacccag 58380 

aagttctagg ccagtatgga caacatagca agagtctgat ttaaaaaaaa aaaaaagtaa 58440 

agagggaaac caaataggtg acgtgccaca ctagtgtctt cctgctccaa gggtcctggg 58500 

cacatgagct tgcttagtgc cagaaaagtc agaggaggag agggcagaca gagaccctcg 58560 

tctccacctc ctttgactga ctaatggggc tggatataat ctgttttaca aaaggacagc 58620 

ttttcagagc tgtttctatc taaggttgct ctgaatagcc atctcgaaat atgccagaga 58680 

agaatattta gaggcggcct attttggtct cccacaaaga tttcacaggg aaaatgtatt 58740 

tgtgttctat ttatacaact aaaaatatgc atcagcccgg ggaaactggc tctttgctgc 58800 

ctttgaagtg aaggatgggt taatttctaa gaaagtaaaa gcaaatgtag tgcaggcacc 58860 

acggatgctg ccagacacca gcgtttaagt ggcttgaatg gaagagcaca ccccaagtat 58920 

ttgaagtagg tgaggagaga gaccaggtct ccagagttgg gcctgcagtg gccagggtaa 58980 

gtccaagggg cagtatgacg cagaacagag tggcaacctc taccagtagt agaattcagg 59040 

tctcattcct aacctcccat aaagcagaga atattgcact ctctttctct ctctctctct 59100 

ctctctctct ctctctctct ctctctctct ctctctaact cacagagatc ctcctgcctc 59160 

tgcttcctca ttgctggaaa taaaagcatg tgccaccaca tgcagctctc cttgtttgga 59220 

gagagaaagg cagacagaca gacagacaga cagacagaca gagtataatg tgtgcagatg 59280 

tccacagaac tccgaagagg gtgttggacc cccagaaacc ggagttctag gcagttgtaa 59340 

gctgtccagt gggtgttgcg aactgaactc aagtcctctg gaaaactgga aagtactctt 59400 

aaccactaag ctatctgtta gtcccccaag aatgtcttat cttgataggc cttcagatct 59460 

cacagtccag ggggctgacc tgctacaagg ggtagaggaa agaaaactgg ctccaggcct 59520 

gagttcagag tgacatctcc agagttcttc cttcccttgc cgtgtagtaa tttcctaccc 59580 

tacacgagga gaaaaggaac agatatgtcc agtatgcctg gcatcttgaa agggcactga 59640 

ctcttgccgc tgtagagccc ctttccttgg ggagtgcaga gaagtgctgc tagagaggtt 59700 

caaaagaaac acaacagtca aaatagttgc tggccaaggg agggcatgtg ccctgtatcg 59760 

ggcctgcaaa gccctttcaa cagtgtagcc aaggccatgt ttgacagtac aagcctgtaa 59820

gcccatgcat aagcaagggc tgggataaag ggctactgtt caagttagtt atatacacat 59880

caagtttgtt catcttacat ccttaggtaa gggtgggtag cttgttagct ccacctctcc 59940

agcaggaaag catgtccacg gtaaggtaga tactgtagag tttagctttg ttttgacctt 60000

tctttcttcg ttgcttggga ttgtacttgg gaccttcact catgcagccc atgtttcagc 60060

cttgcataaa tctataaata tatggttgga gtcaggtcca gcatatgtgg gtaccagtcc 60120

cagtctggta tagcaattta ttagctattt gtctttagac aagtaactat agggttctga 60180

gctgtaatgg ccaccaaaca ggtcctttaa gtatctttat tggccaaaca ccagggttaa 60240

aacactgaac acagtatggt ggtttcaact gtcatggagt ccacattcta gcaggggaat 60300

tttacacaca actcacaaat gatacagacc acagcagggt gatttgacag agacaacgaa 60360

gggcttgcct gagctgcagt agtccgggag cagctcttga tgggatctga gatctagatg 60420

gcagagcgcc cttcctagat gaaatccaag cctctggtta ctcctcagca agagtagctg 60480

gaggaagacc agaagtgagg aggaaaaggc tcagagtgga gcatctaagc ctgcagcacc 60540

tccagagaca ggctttggag cttgcctgtg gttcagccta ctgtgggaag agaccagtga 60600

gggttttaag ctgtggggat gggtatattt tggaaggcga acaagaacac cagcagctca 60660

ggcaatgtag gagggcggga aggtgatgga tgatcgagga cggccccaga gattttaagg 60720

actgtgtagg tggggtgaga agcactgtgg gtgaggtgga gttagaggga agaagtgaag 60780

atctattttt ttggccatga ggaatttggg gtgcacacac acacatacat acacacccca 60840

tcatgtcctt gtctacagac ggggtgataa tggtactgga aggtagaagg tggaaagagg 60900

accagcaagg aaggtatcca gtgcccaccc cacagctcac cctcagacag accttgttct 60960

cattttcatt acccaggtgc ccggagaggc cagtcccatc accatggttg ccaaactcag 61020

ccaattgaca agtctgctgt cttccattga agataaggta ctcacaggtg tcctgtagca 61080

cctcctgatt tctctgcagt cccctatttg cactgagtgt ccaagtccca ggttctgcat 61140

tcctatgact cgctgacatt cctgctgtcc acaggtcaag tccttgctgc acgagggctc 61200

agaatctacc aacaggcgtt cccttatccc tccggtcacc tttgaggtga gtgattctgg 61260

agcagtattt ggccgtcggt tcctgtgaag agagtgcttt tgatttaggc tgtgaggagc 61320

gaggaactca atggcagaga atagtagtca ggccagccac ctcgctgggg cagtgggtaa 61380

acactaaaat cgctgcctaa atgcaccagt gaagtacccc tcatgggagg ctgcaacgaa 61440

agagctcccc attggtccaa gatggtatga acctccttaa aataggaagg agaagtgaca 61500

tcaaactgag ggccacacgt tgcctataaa atggtgtctg gtacaggaac agtgggagaa 61560

aacaattatt tttgttttgg gggtttggtt tttttgtttt gttttgtttt gtttttttgt 61620

tttttttttt tgagacaggg tttctctgtg tagccctggc tgtcctggaa ctcactttgt 61680

agaccaggct ggcctcagac tcagaaatcc acctgtctct gtctcccaag tgctgggatt 61740

aaaggcgtgc gccaccacca cccagctttt tagttttttg tttcttgttt tttgggggag 61800

aaaacaattc ttatgatttc cttaaaaatg actggcttaa gtggagacct agaaaggtgg 61860

ttctttctct ttctttttct ttctttcttc ctttctttct ttcttccttc cttccttcct 61920

tttttttttt ttttttttgt ttttcgagac agggtttctc tgtatagctc tggctgtcct 61980

ggaactcact ttgtagacca ggctggcctc gaactcagaa atccgcctgc ctctgcctcc 62040

caagtgctgg gattaaaggc gtgcgccacc acgcccagct ccttctttct ttttttaaag 62100

gtttatttat tattatacat aagtacactg tagttgactt cagatgaacc agaagagggc 62160

gccagatctc attacagatg gttgtgagcc accatgtggt tgctgggatt tgaactcagg 62220

acctctggaa gagcagtcag tggctcttac ccactgagcc atctcgccag cccaggtgta 62280

tttttcttaa tggaccccca tgattcttgt acttccagct caacttggtc ctgcctttat 62340

gagacacccc actgcccacc ccaaagccct cttcatccct gagaggagtc tcacaaaaaa 62400

cagatttcca tttcttctta ccaccttggt tttttaaatt tcacatttat ctatcatctg 62460

tctgtctgtc tgtctgtctg tctgtgtgtc tgtctgtcta tcgtgtgtgt gctgagtgag 62520

tgagtgagtg agagagagag agagaatatg aaggaaaagg acagcttgca agagtggatc 62580

tctcctaccg tgtgggtccc aagagattga ctccagattg ccatgcttgg tggcagattc 62640

cttttcccac aaaaccatct tgcctgactt ctctactgct cctggcttta ggggggcttt 62700

tttgtttgtt tttatttttg ttttcatgag ataggacccc caaatagcct ccttgagtct 62760

acctaaacct tccaccacac ccagcttagc ttttcatttt ctaaacatgt ctgcgtagga 62820

atcgtcttcc taattattac cagtcgatgt tcaattagcg gttgtaacaa ggccccagag 62880

tgtgaatcta ttgctaagat tctttccgga agtagatgac acatcagatt aataagtgtt 62940

gtcaaactca tcagtgaatg aaccatttcc caacagttca tctgtttcct ccaccaaatc 63000

tgttggtgga gcagctgggt tacacatggg cgagctgggc tcagccgcgg tgccagccat 63060

ttttccagca tgctatggaa tataggggtg ttatcaggcc agggtgggaa aagcctggga 63120

atagcgagcc agatagcctc acccgggaca ttgaccatcc cagccatgta gataaatcag 63180

agttgacatc accaaagcca gccactgact gcatgtctta tgctctgaag gtgaagtcag 63240

agtccctggg cattcctcag aaaatgcatc tcaaagtgga cgttgagtct gggaaactga 63300

tcgttaagaa gtccaaggat ggttctgagg acaagttcta cagccacaaa aaaagtaagc 63360

cacccacccc aaccctgcaa cacacacccc atgttctggt tcctgtgtga gagctcttat 63420

gaatgaccag tgactagatg tcaacagcga ctgtcttagt tacttttcta ttgctatgac 63480

aaaacaccat gaccaagacc acttacaaaa taaagtgttt gatgtgggac tcacagctcc 63540

agggggttag agcccatgac catcacaatg gggagcatgg taccaggcag gggagtgtgg 63600

cgttggatca gtagctgaga aactacgtct tggttcacaa gcataaagca gagagggcaa 63660

actagaaatg ccgtcagctt ctgaaatcac aaagccccgc ccccagtgcc acatctcccc 63720

caacaaggcc acacctccca atccttccta aacagttcca gcagttggaa accaaacatt 63780

caaatatatg tgtctatggg agccattctc attcaaacca cctcagtgat tgtcagcaaa 63840

agccagtccc accccctttg cagggtagtc cctgtccatg ggtgctcccc aagccgctcc 63900

ctgcctatgg aagtcttttc ctgatacaag ggcactcctt aggggcctcc cctgcatact 63960

tagccgccag ccctgaaaat gagattataa atttgtcaca actgccccat agcagtcctg 64020

tgtacacacg gtgatttcat gtgtctgtgt gactgtcctc ttccctgcaa gcatgcagtt 64080

acctcactgt gccgggacag gccctgtact cacggtattg ttggcagagc tgctgggtta 64140

ccacccatag caaatcacta accctgtgga gctgaagtat atgaagcctg cgtctaaatg 64200

gtccaaactc tcactgccgc tctgtcactt ggcagctggg ggaccttgac ccgtctctct 64260

gaacctcagt ttccagttct cagggatgag aaagattgag tgaacaacat gagggacagc 64320

ccaacaccca gcttcctttc caccagctct cctcatggac accgttgagc cactgcagct 64380

cttctgtctg ttgtactgtc cagtatttgt acacacagca ggaaagttcc ccatgacagc 64440

tggcggagaa gctgctgtgg gcagagcagc atccagagaa gcaagccacc catatggtcc 64500

ctgcccacat gcaatttgta ttctggtgca ttgagggtga ataattctca gctaaatcta 64560

ccaatacagt ggcttcagct actcggaagc tttgaagaaa tcaggacagc atgataggta 64620

aagatccttt gggaagagag aatggacaga ggaggctagc ccttttaagc aatgagcttg 64680

tgagctgaga tctcaaaaaa gctactggtt ggacaggtgc ctgagggaag agtctcccag 64740

cgagtgcata aggtggagtg acaggatctt aaagaacaga aatagatgtg gggacagtgt 64800

agaatggaat aggagggcaa tgggaatgtt gaatgaccta acctcctgaa gaggactttg 64860

gaaaaactcc caagcaagaa gttttaggct agaccttggc gctgttgaat tatggatgca 64920

ggtggttctg agggtagccc tgggtgaatg ctcatctcca tatggggttg gacaagccca 64980

ccttcaaatt caggttctac ctcctgccag ctgtgtaacc tagcctctct gcgcctccag 65040

tctccatctg taaaatgggg atgaagacaa cagatgtcag gtggagattg catgcttatg 65100

agagcaactt gcacatctcc tcactcctct aagttgagtg agagaatgcc caggcagggg 65160

ctagaaaggg gacctgttga tcctacacct cagcactggg gaagaaaagg gctctttttt 65220

tccctcttgc attttgagtt cgtggcagaa gccaggacag attaccaaaa gaaaggagaa 65280

tggattgatc acatgtaagt ctcgtgacgt ggaagccctt acagggaaag caggaccagg 65340

aggatggcta agcagagtgc tgtagccaga tttccaaaag gtgggaagct agaaagtgat 65400

agcagagagg ggctgggggc taagtttagg gcaatgggaa gagcttggcc aggcccgtaa 65460

gctcagactc ctctcagtct tgtttgtcct tggcaataaa tgccttccct tcgggtacag 65520

agagcatact tgtcacatga gggttttatc tccagcttag aggaagacca gaaaatcctt 65580

tctccctcct catccttctt gtcagggtgc tatattttgg ggtactgggt cctcagctcc 65640

atcagtaaat agcacaggag gtattagaag tcaagaactg agtttcccag ctcagtcact 65700

gtgacttggt ttggtgccca ttctggtcct tggttgcatg ttgctacaag gcttaaaaga 65760

atgaatatca aaatgtgtga tgcctggcgt ggagagagtg tttggtgcaa ttggccactg 65820

gcggaggagt gtcattggta caattatcct tccatgtttc tgtgaggaca gaatggaatg 65880

ggggtggaag gtaaagagag gaagtaggaa agacagcctc tggactctgg agtagcaagc 65940

tgtagggaaa ttacagagct gagcttacat catcaggaca gggctctctc tgccttactg 66000

tgtctagttg gggagtgatt gatggtttga gcgggaaggc tatgccagaa gccgttggct 66060

tagaatagtg catactgctt gtacagggca aataagtcca gagattcatt gcagcagggc 66120

cactggtgac agctacttag agcaaactga actcccgaga agctctgaga gctagggttt 66180

ggtgctagct gggtgctctg gctactggga tggctgtgtg atgcactgcc cgtgctacat 66240

agatttaccc attccactgc gtgcccatgc ttcagagcct cctgtgtgcc atggtgttta 66300

tgtacagctt tctctgtcaa attaaaataa attaataata aaggaacaat gttgagaggg 66360

gagagagaga gagagaggat gtggatagat attttggggt tttttttctg ttggttgttt 66420

tgtttcaaga cagaggctca tgtttcctat gctagggttg aattcgctct gcagctaagg 66480

tgagtcttga cgcccattga tcctgggtaa ctgcattcct tacataatct agatctcatt 66540

ggttttagga aagcaggaaa tacaaatgag cacatgttaa ataaaatcat atgttgtact 66600

gtactctccc agacagcctt gtactgtact ctcccagatg cttttaaaag taacaatgct 66660

aggccacatg tggtggtaca gccttttaat cctagcacac agaaggcaga acaggctggt 66720

ctctgtgagt
taataaagac 
gggaagaagg 
aacaccaagg 
tgtgtgcaca 
aatgaccaga 
atgggagctg 
agcctcccag 
ggaccaatgc 
gttgcaaatc 
tatgtggctt 
tcttatgagc 
aaacacacga 
aagctggaga 
tggttccccc 
cacatgccct 
gcaaacagct 
agctcttggg 
agagaattca 
aatagaattt 
acagaagagg 
tttgaactca 
agccccaaaa 
aaaaaaaaag 
ggctacatac 
attttaaagg 
cttgtactgc 
cttagaaggt 
ggcagctttc 
acaagtctac 
ttagttttgt 
caaggaactc 
cagtaggcct 
aaaatgacaa 
cagagtgaag 
tgccagtgat 
gaactgacct 
tattactcag 
gtgagggggt 
tgatcaaaga 
aagtcaaaca 
aggtgtaaac 
aaaggagtag 
gacctagagg 
tggggaatga 
cttagcatgc 
gttataaacc 
catactcctc 
ctgtcatctc 
ttaggaggtg 
aataccctac 
gtaaaactca 
gataatagac 
agttgccttg 
tatgtgctcc 
taagaaactt 
gggctcatgg 



tcaaggccag 
cctgtcttaa 
aaaaagaaaa 
aattttgcac 
tgtgtgtatg 
agaggctgtg 
gcaaccaaac 
cccacactcc 
tgagtgtttc 
taagctgtcc 
ttttatgaaa 
ttgcacccat 
tgggatgagt 
gacggcccgg 
cctcaagaca 
cctctgacct 
gtacacataa 
agacagaggc 
aagacaggaa 
atttttaaga 
gcatcagatc 
ggacctcggg 
tagagttttt 
agttctgtac 
aaccatgttt 
actggagata 
cgaaataaaa 
gggtttggtt 
ccacaagagt 
cagggacttt 
aatcacagta 
actcctcatc 
agctccaggg 
atgggatttt 
agacagaatg 
ggcggcacac 
tggtgcccaa 
ccataaagaa 
acaattaggg 
gtgaatgaag 
tgttttagaa 
aaagagagag 
atgccagggt 
ctgtgggaaa 
gaggggagtc 
agtgtgaccc 
caggaaaggc 
cctctaactc 
actgagtccc 
tggctttgtt 
tcttagctgc 
ctccttctgc 
gtaacctctg 
gtcatggtgt 
tgggtatgag 
gactttctct 
gcccaactca 



tgtgatctac 
agaaacgggg 
atagtacaaa 
ttttcttttt 
cacttgcctg 
gatcccctag 
tggtgtcctc 
aagtgagggt 
agacccacaa 
tctcttcagc 
cctatcttta 
gaaggaagag 
tacacagata 
cagctaaggg 
tgacagcttc 
ccatgggctc 
aataaagatt 
agggggatct 

tatgtagaga 
tttattatat 
tcattacaga 
taagagcagt 
aatttaaaat 
aagggttaaa 
caaaaatcca 
catctctgtt 
taaaaacaac 
tgggtttttg 
catatggagg 
gaggagggtg 
atcaaaccca 
ctcctgcctc 
aagggcttct 
ataaaattaa 
ggagaaagtc 
ccgcgtttat 
cattgaagga 
caacacgata 
aagaaaggga 
aggttacaat 
aataattctt 
ggagacagaa 
gctgaaggag 
gcacttactc 
tgaatggagg 
tccgtaccac 
caatataata 
ctctcagacc 
tgtgatggtt 
ggagtctgta 
ctggaagtga 
accatgcctg 
aacctgaaag 
ctgctcatag 
accatcccct 
ccccagcagt 
gggaaacgtt 



atagtgagcc 
aggaagggga 
ctgagtgtta 
ataaatttac 
tgtgtgctca 
agctggaatt 
tacaagagat 
acacagtgtt 
aagccagaaa 
tatgtatgtg 
attggaaaag 
tgggaccagg 
aagtgaaata 
cactgggtgt 
cagccatctg 
tgcgtgcatt 
tgaatcaaag 
ctgtgagttt 
gatgtatgta 
gtaagtacac 
tggttgtgag 
cagtgctctt 
atttttgaat 
ggcagggcat 
aaaagaaata 
gtccagtgct 
cagaaattaa 
agccagagtc 
tactcaatgc 
ctcaaagtgg 
aggacctatg 
tacctcccaa 
caaaacaaaa 
aaaactctga 
tttgccagtc 
cagagcgaca 
agaaaatgtt 
ctatttacca 
ccaggacaga 
gaatagatat 
ggaaagctac 
tgagagagag 
gaaggtttcc 
ataggacaga 
aaagcacata 
caaaaaagaa 
tgccttttct 
caccctcacc 
tgtatatgct 
ccctgtacca 
gtattctgct 
cctggctgct 
ctagccccag 
cagtaaaacc 
gggacctggc 
catctgccaa 
tttaagttac 



ccaggccagc 
ggtgggagag 
tggtacatgg 
attgtgtgtg 
cccatgtaaa 
acagttgtaa 
acatactctc 
tcttcttgtg 
tgtcatgatt 
tgtgtgtatg 
tctcatcttt 
caaaaatgag 
aggacatgag 
tctttcagaa 
taaccctagt 
cagtgcacag 
ccagaaatcc 
gaggacaggc 
tgtagtctca 
tgtaactgtc 
ccaccatgtg 
aaccactgag 
taataggaag 
caggaattca 
aacagatttt 
tatcaatcat 
aaacttgagg 
taatgctacc 
ccccacctct 
ttatttgttt 
catgtcaggc 
gtgctggtta 
caaaaacact 
ccagcaaggg 
acacactgac 
tttgttatag 
cctgaacatc 
aaaatggatg 
aacaaagagt 
gttaatatat 
tagagatgtt 
aagggtccaa 
aacaggcctc 
cagcaaagtg 
gcagggggtt 
aagaacatgg 
catactgccc 
ccctccctcc 
tgggccaggg 
tcacagtggg 
agcagccttc 
gccatgttcc 
ttaaatgttg 
ctaactaaga 
actacctgga 
tagttcctca 
ataggaaaaa 



cagtgctgca 
gagggaggga 
ctggaatcat 
tgtgtgtgta 
catacatgca 
gctgcttgac 
ctaaccactg 
tcacaggaaa 
cagcctcata 
tatgtatgta 
gtgtctttat 
tgagaagaca 
cctatgaaat 
gacccaggtt 
tctattggat 
atatgtgcag 
ctttaatccc 
tagtctatat 
aacaaaacaa 
ttcagacaca 
gttgctggga 
ccacctctcc 
agggaagaaa 
agccagccgt 
caaagtaaaa 
atgcgggccc 
gaaggaagct 
tcggttgcct 
gtcagtcatg 
gtttagctaa 
aagtggtagc 
tggcgtctgc 
ccaatagcaa 
aaacagttag 
aggggatcaa 
ccaagttatg 
atgggaggag 
ggactggggg 
aactgtggaa 
gctcatttaa 
accagaatga 
ggagagggac 
aaagtcagca 
cttagtcatg 
aggagggtag 
tcattgtaca 
atactcaccc 
tcccaacttc 
agtggcacta 
tgtgggcttt 
agatgaagat 
tgccttgggt 
tccttataag 
cagaaattgg 
gctacatcct 
gctaggggag 
gaacaaccag 



66780 
66840 
66900 
66960 
67020 
67080 
67140 
67200 
67260 
67320 
67380 
67440 
67500 
67560 
67620 
67680 
67740 
67800 
67860 
67920 
67980 
68040 
68100 
68160 
68220 
68280 
68340 
68400 
68460 
68520 
68580 
68640 
68700 
68760 
68820 
68880 
68940 
69000 
69060 
69120 
69180 
69240 
69300 
69360 
69420 
69480 
69540 
69600 
69660 
69720 
69780 
69840 
69900 
69960 
70020 
70080 
70140 
gactggtctg tcaggagcct tgtcgtggtc acagtgacat aagcattgag tgtttatgta 70200

gctaaaatac aacaataaac caggtggtca tgcatgcctt ttatttatgt ttctattttt 70260

taagaaacat gagataaagt taaaagagtt cattgaaaaa cttgaaatta gggctggaga 70320

gatggcacaa cagttaagag cactgactgc tcttccagag gtcctgagtt caattcccaa 70380

caaccacatg gtggctcaca accatccata atgaggtctg acaccctctt ctggtgtgtc 70440

tgacagctac agtgtattca tatacataaa ataaatatat cttttaaaaa aaaaaccttg 70500

aaattagttg ccttcagggg gcaggacatt ggagggagca gagtggaaag atactattgc 70560

caacaacaac aaatcttaaa gaataatttg attcattttt aattttatcc ttattatttt 70620

taattatgta tgagtgtatg cctgggtatg tgaggtgctc caagagacca aaggagtcgc 70680

ctggagctgg agttccaggt ggttgtgagc tgcctgatgt aggtgctagg aattgaactt 70740

ggatcctctg aaagagcagt acatgttcct agctgctcag ccgcctctcc agcccctcaa 70800

tttttgatct aaaaaaaaag ttgaaagcta caaattttct gtgaattttt cacaaattaa 70860

gcaagacaga gaaaaggtag ggaggtgaag tggcatggaa ggagaaggag aaaaggatgg 70920

acagtctcat ggtgtgtgga gcaaagctaa ctctagctgg ttctgggtac cacactttaa 70980

aatgagcccc cagggcttag cccaggtatg tgggggccag attggggctt gggtgcctcg 71040

ggagcaatgg acagtggaca tagcggataa agaaacaaat gcatgaatcc ataaacttct 71100

gtttttctga agtcctgcag ctcattaagt cccagaagtt tctaaacaag ttggtgattt 71160

tggtggagac ggagaaggag aaaatcctga ggaaggaata tgtttttgct gactctaagg 71220

taagtgacaa tagacatcta atgtgaggtg aaaaggatca tgggccttga ggaccgtgct 71280

gtgctaccag ggctggcgat cattccctcc ctgggttcta cagcaacctt ttcattcctg 71340

ggaagctgaa catagctagg taagaaccag gctctgggcc agctgggact tgcagagttt 71400

tagttccagc gtcccaggat cacaagtact atcttccagg aacaaggaga cactccttgt 71460

ctcaggccca gtgctgagct ggccaggatc ctagccttta ttaacagtgt ctcttggcga 71520

gatttaggag ccctttgggt aagctctgga tttaagacct gaaccagtgc tcccctgtca 71580

cttctgtgtg acttctgttt ttaaactctg ccctgctctg ctaaagtaac ccctggatgt 71640

gcttcatcca atccagatgt gtggtctctg catcttgatt gctcacctca gaagtcccct 71700

caaatgggct gattaccaca cagaggcctt gctgttcacg ctcagtgagg cgctaaaagt 71760

ggatagaaga acagatagtg tgggtccaca tgctgcccca cctctggggt gtgggacatg 71820

ccacttagga aggtgaaaca caggagtttg cctgtgctta tcccatccta ctgccatcag 71880

gtgctgggct tcagggaaga accaagacac cattctgggg gtaggaagat gcaatctaag 71940

ataggggaga ttgaagttgg ggtttcctga cttccgggcc tccagttgct tctggagaac 72000

tgcacagaga agtcggtaat gaagtgtcag ggattcatgg gcgacacagc gtgataaggg 72060

ggaggggggg ctgtgtgtgt gtggctgagt tgcaaggaat tgagagcaga gttcagggag 72120

ggacccacag gctgatgact gggactagcc cagggggcca ggggaatggg aactctgggt 72180

tagcttatca acaaaggggt gcagatggac tttctatggg agacttaagc ttcggctcta 72240

tagatgaagg ccttccccat gctccccatg gcagttctca atgaaagaga cagtccttgc 72300

ttccaaagtg ttattggttg ttcatcatgg aaaatggctg cttaccaact tctctgggtg 72360

ggccagagat taaatacctt cttcaagaag gaatgtcggt tcaagaaggg acactttgtg 72420

gctaaactct cagggcctct aggtacctac ctcattagta tagagaattc tgttttgggc 72480

tatgtgaaca acatggggaa tatggtatgt gttccaggcc aggaatctgg cagggagcag 72540

ggaggcacct accatcttaa gtgtgagttt ctggagtctt ctgttctcca ttaccaagtt 72600

tcctggcatg gtgaactgtc ccacaagata gggcacatag cttagcatgt gtggctgact 72660

cacgggtcag acattggaga caagctaagg gaaggctccc agtacagtct catagaggta 72720

acccccacca ggttcccaag gcatctactt tcactcatcc tgggagggaa tatgaaggta 72780

cagtggcagg catgagccga cgggaacagt agcaagagag tacagaaagt agagaatgtt 72840

cacatggtta ggcagcattc taggcttcag gagcataggt cccaaactgg ctatttctgt 72900

cccattgctg caagctcagt ggggttgaga cactgaccct tgagccatcg ctggttaaag 72960

gcaggtttca ataaagaagc tcttgttatg tcatgatgag ctggtatctt tagagtgggc 73020

cactgtgata tttgcaatga tgaggttgaa tcatatgaaa ttgctggctt tgaaaccagg 73080

tgagggacac caggcatccg gcctttcata ttggcacttg tcatagttag ggttttactt 73140

ctgtgaacag acacaatgac caaggcaact cttgtaagga caacatttaa ttggggctgg 73200

cttacaggtt cagaggttca gtccattatc atcaaggcag gaacatggta gtgtctaagc 73260

aggcagacat ggtgcaggag gagctgagag ttctacacct tcatctgaag gccactagga 73320

gaagactggt tttcaggctg ctagaatgaa ggtcttaagg ctcacactca cagtgacaca 73380

cttcctccaa taaggccaca cctatcccaa caaagccaca cctccaaata gtgctattcc 73440

ctgagtcaag cattcatagg actgcttgaa attttgtgtg ttctcattta aagtcagcca 73500

ccctattgac gacagccctc tagactaaga gagagacgag tctcatgctc cgctgtgagc 73560
ataggagaaa tacatgtaag agatcaggga ggcccttttc ctaacatgcc ctttggcctt 73620

tgcttagttt tatttaaaaa aattatttta tttggaggtg gatgcatgtg cctcagtgca 73680 

tgtgtggagg tcagaggaca acctgcagga gttacttccc tctctaccat atgagtccca 73740 

aggattgaac tccattgcat agtgtatgtc ttcactgcgg agccattgtg ctggccccgt 73800 

ttctctacat tttctctcac caaataagta attcagtctt acagaaataa aataaaaacg 73860 

ttgcaaccaa aatagttaaa agtttaaaag aaaagcagca cggccttcac tgctaaccgg 73920

ttctggactt ccccttccgg tcggcttaag ggattattgt cgtcatcgtc gtcatcgtcg 73980

ttgtttgttt ttatggtttg ccttggtttg ggctttcgtg agacaggatt tttgcaatat 74040 

tgttctgtct agcctggtgc tcactattta ccacagattg gccttgaact cacggcagtc 74100 

ttcttacctc agccttccaa gtgctggaat tacatggcca tgccccacac tttttttttt 74160 

taaagattta tttattaatt atatgtaagt acattgtagc tgtcttcaga cactccagaa 74220 

gagggagtca gatctcgtta cggatggttg tgagccacca tgtggttgct gggatttgaa 74280 

ctccggacct ttggaagagc agtcgggtgc tcttacccac tgagccatct caccagccct 74340

tttttttttt tttattgcaa tggtgataca cattccattt tgccaagtgg cattttataa 74400 

aaattatgta ttgtggccac ctttctgtgg ataaagtaat tctacaccat aatgttaaaa 74460 

ctctgtgtgt gtgtgtatat atatatacat acatatattt gtgagtccct tacctgtgaa 74520 

catgctccag tgctctaatg tatcttggtg cagttcctca ttttttgctt caaatcataa 74580 

ccttgcagtg gaacagctgg gtccaagaag ccatatactc atagctttga atgcatctct 74640 

ccgggtgggt gcgtctctca tttatacaca cacacacaca cacacacaca cacacacaca 74700 

cacacacaca cacacacaca aagacctttc agtatttttt taagctgacc acatctttgt 74760 

agaggccagt gtgcagggag gtggaggttc tgctaagtgt gtctgtcctt cctccctcag 74820 

aaaagagaag gcttctgtca actcctgcag cagatgaaga acaagcattc ggagcagcca 74880 

gagcctgaca tgatcaccat cttcattggc acttggaaca tgggtgggtc cctgtgcccc 74940 

ctccatccta ccagctctac ttgggtccac cttcctgcct cagcttctac aagtggcaca 75000 

agggggcacc tctatctctc agccatagtc ctggtctcaa tcccatctat acaacctctt 75060 

ttctagagaa ttctaccttt gtcagaggat gatgaacaag taaaataggc tttaggatta 75120 

gtgcacccca aaagtagcca aagtagctta ctctgcagtt taccaagggg ccccaggcct 75180 

ggaataaaga agtctggtcc ttgcttatct agggaggaag tgaggggagg gatagagaca 75240 

tcagtggatg agtagatgga tggatgggta agtggattga tggatagatc ctagtggccc 75300 

taagtttaag tgccatataa ataatatata aggttttttt taaagattta tttattatat 75360 

gtaagtacac tgtagctgtc ttcatacaca ccagaagagg gaggcagatc ttgttatgga 75420

tggttgtgag ccaccatgtg gttgctggga tttgaactct ggaccttcgg aagagcagtc 75480 

gggtgctctt accaatgagc catctcgcca gcccaatata taagttttta gcatatatgt 75540 

gtagttaaaa gtaaatgaac aaaatatgta tgctaaatat aatggtagca taatcatatg 75600 

gtaaatatgt aagtgaaatg atcatttggt ataagtactc aacaataaaa gctaaggtac 75660 

aaaatatata ttggattaaa tatatatcat atataacaat taccacatat tattatacta 75720 

aatatatata atagtacatt atactaattg tcaattatac taaatgtttt gggatatata 75780 

ttgagctaga attaaatttt taaaatgtcc agttaaatct gaatctcaga caaccattgt 75840 

atggggcata cttactctaa aaatcaccca gccaccagaa attcaaattt aatggatatc 75900 

ttagtctatc attaactcct caggcactca ccctcaggtg ttcaggcatg tggcacaagg 75960 

acctttctag ttctcttgtc cgtctgtgaa gagttctttt ggttgaacat ggcgggaggt 76020 

catcccattg ctgcccttcc agcagcaaac acatgggggc gcccttccag gagaagtcac 76080 

atggaggcac taggcaggag ccctgctgcg catggcaggt ctccttatga gtttcactta 76140 

gagccccccg gggggggggg gggagtttgg cagagatcat tatgtggatt attggcttct 76200 

tgtctctctg ttggctgagg gacatagata gatctcctca gggtgtgggg agaatatcta 76260 

cccatactta gggtagctcc agaagtagat tacccatcat tccttgaggg agctgcttcc 76320 

actcattttt caaaacggaa gctgagacaa gttgcagccc cctcctgttt ctctgttgag 76380 

cacctcttca tctcagttct tagctgactt ataactttcc tccaggaagc ctttggtgtt 76440 

tcctggctgc acggtagaag cctccatccc ggacatgtca ctcacccctg aaaatgtaaa 76500 

ctcctgagtg acaaacgggg ccagatttca tgtctcactg ccaccctaga gcctaggaca 76560 

tgtctggcaa cctgttcagt tagatcaggg ctccagagag gatctcccag gctttccatc 76620

ctttgaaatg taatgttatc tttacaacaa gatctcactg agctctgccc atagctaaca 76680 

tgtaaggtgg tatcagcact gagtcctcca tctagggcat ttgcccccat gatttttgta 76740 

actatgtgat tctcctcttc ccctgttttg aatcccccta atgcccttga cctgacctgt 76800 

ctttgactca ttcggtgttg ggttttgctg ttgaaggtaa tgcaccccct cccaagaaga 76860 

tcacgtcctg gtttctctcc aaggggcagg gaaagacacg ggacgactct gctgactaca 76920 

tcccccatga catctatgtg attggcaccc aggaggatcc ccttggagag aaggagtggc 76980 
tggagctact caggcactcc ctgcaagaag tcaccagcat gacatttaaa acagtgagca 77040

gctggccagg cctggggtgg gaagacagca gactctttca agcattccag aagtcagaca 77100 

ggatacttcc aaagatgtat aggattgctc aggggtaccc cactttcaga gccacagatg 77160 

tgcattgagg tggcaccctt acaagttgat agggtcctga gtccgccatc ttccctactc 77220 

ctgcttaaaa gaataatatc gccgggcgtg gtggtgcacg cctttaatcc cagcactcgg 77280 

gaggcagagg caggcggatt tctgagtttg aggccagcct ggtctacaaa gtgaattcca 77340 

gggcagccag ggctatacag agaaaaaacc aaaaagaaaa gaaaaagaat aatatttact 77400 

tcctagatgc attttcagtc ccagttctca tctctgaggt gctttgtctc atttctaggc 77460 

attgttggaa gttccccctg aaagctagga aatacagaca gggtgtctta ctcccagggt 77520 

ggaaccgggg tgcatgttca gggttccaaa ggtgtacctc agtcctgtgt tgcaacacca 77580 

tctcccacca ccaccaccag gttgccatcc acaccctctg gaacattcgc atagtggtgc 77640 

ttgccaagcc agagcatgag aatcggatca gccatatctg cactgacaac gtgaagacag 77700 

gcatcgccaa caccctgggt gagcatagag ggaaagccat tcctgtgcat gctcctcctc 77760 

ctcctccttt tggcaatgtc ccaggataaa ctgtgagagt cctgtcctgg gatcccttcg 77820 

tctctagcaa atccagaagg ttttccttgc aaaaactcat ccagggtcta taccagctct 77880 

ggcaatctgg ctaaaatgtg ggttctgtct cagtaagtct cagatggttt ccattctgac 77940 

aagctttcag gtgacatagt gccagccagt ccagggatcc cacttggaga agtgtgtgta 78000 

cttgtgtgtg tccgtatttt ggggtgtgta tatgtgggtg tttatgtgcg tctgtgtctg 78060 

tgtatgtcnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 78120 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnat gtgtgtgttt 78180 

atgtgcgttt gtgtctgtgt atgtctctct gtgtgtgtct atgtgggtgt acatgtacat 78240 

gtgtgtatct ggggtgaaca tttttgatcc cacaggagtg gatcctgagt agatagtcat 78300 

tcactcagtt gggggaggag gcacctccca aatcccaagg ctccaggctg ctccacccct 78360 

ccttgtccct ctctgctcag tatcatggcc cccaacatcc ccatagccag aaacagatga 78420 

tagcattctc cttcctttct cctatgtaga aaatcacaag gctgctacca cctgccagtc 78480 

actggcatgt cccacccccg ctgacctatc cactctcctc tcttcccttg gtctgcttct 78540 

ctttctcctt gctaacagca ttagtaccct ttacctcacc tcagcctctg acccctcggc 78600 

aagcctgcct gcctgcctga cattctgctt cttctctttc ttcctccctc aggaaacaag 78660

ggagcagtgg gagtgtcctt catgttcaat ggaacctcct tggggttcgt caacagccac 78720 

ttgacttctg gaagtgaaaa aaagctcagg taatgggagc cattccctcc atgcacccag 78780 

acagccttag cctcgcacac cctgatgctt gcccagcctc aggaccttct ggacatctgc 78840 

cgtctgagca aagacaactg tagtacagta gagacctcag gtatatgtga ctcttgtctg 78900 

gaggacagaa aacatgttga tgtcattatc aacctagtga acccctttga gcctggttca 78960 

tttctaaagg gaaataccac tcaggagcat cctatagctc aaagctcact ttgtctgcat 79020 

gtgcctctgt catgaccata accagctatt tcaattgctg atctgtttcc taatgggtag 79080 

attcctgcat ggaatctaga ctctgtgttc ccagttgggt ttcccagaca cactcaggtg 79140

accactggga cagagcttga gtctaagtct ctgtttccca cctgttgcct ctgacttcct 79200 

actgtcagat caaggacagt acccatgcta gatgtgctag atggcccttg ttttcttgaa 79260 

ttaggagaaa tcaaaactat atgaacatcc tgcggttcct ggccctggga gacaagaagc 79320 

taagcccatt taacatcacc caccgcttca cccacctctt ctggcttggg gatctcaact 79380 

accgcgtgga gctgcccact tgggtaagga gactccacct tgggtcgagc cataggggac 79440

agaggctctc aggagtcggc tgtggacaga aaatcaaaga gacatgtgaa cactcagaac 79500

caggctaagg tagcagctct tcgcaggatc tcaatacagc tcctgtcatg gatcctctgc 79560 

ctttcttcat agttcccttc tgcttgtgcc tcctttgccc caggacttct atggctgccc 79620 

cttgtcactg tcactgtctt tccagtcagc cagccagcca gcagcagctg ggcttggtca 79680 

ggaagcacag acatcttagg gggccagctc acttatctca acaaaccatg gggtcactgg 79740 

tggatccaag gcattgcttg cctgcctaca ggaaagggac caacagatgt gaccacaggg 79800 

atgagtctag aacccacaca gttgttttgc atgctcataa aagttgaagc aggagacagc 79860 

gtggggtgga acagtgagtg cccttctgtc tagaaatcag agttagcagc ctgcttccct 79920 

cccacgcagg gactttaatg tattcccaca ctcgagtcct gggagggaca aaggaagcaa 79980 

agcaggagct gaccctgttt tatagtcaaa cacacagaga ggctagggaa gcggctgaga 80040 

ggagattgca catcattcat agctgaggtt ggacctcagt tgtcccttat cccaagcatc 80100 

tctccatatg acccatgtcc cttggtctcc tttgggcttc cagccaccat cacagctctg 80160 

cagccagctc accatgcctt cctcactagg gacccccagg ggacatcttt cctccctggg 80220 

aatgaacagg caggtaggag catgagaaga gtttccatcc taagcctcct gaggaaggcc 80280 

tgccacagct ccacgctcac cagggcagaa gcccagagcc tggctcaaag ccacagagaa 80340 

gtagacagaa gagagaactg ctggagccca aggccagtga attccccagt tctgcactgt 80400 
ggcaccaatt cttgtaatta ggctgagtgg ttacaaagag ggtgatgctc agaagcctca 80460

gagccaggga ttccccacca ccactagtgg acttaacact agatgccttt gttcccacag 80520 

aaagagccag agaaagtaat ggagaaatac tacctgtctc atgaagtatt gctaacaatg 80580 

gtgtttaaaa gctgatgtat aaaaacaaca gatctacatg caaatttggg ctagtctgta 80640 

gctaattatg gtgatggtgc agcccttcta ggctttcctt atccttataa agtgattgcg 80700 

agtcaggaat cagcatccca gacggatgtg gtcattggga agtgtggtga agaacacgat 80760 

ctaacctggc atgtccccgt tccaggaggc agaggccatc atccagaaga tcaagcaaca 80820 

gcagtattca gaccttctgg cccacgacca actgctcctg gagaggaagg accagaaggt 80880 

cttcctgcac tttggtgagg gcacactgtc tctcctttgc atcttttcct tccatcttct 80940 

ctctacctgg acatcctgac caaggacagg gctgtgtctg cagaggaggg aagctggagg 81000 

tgctagggaa gccaaactga tggccatgcg tctgtttcag aggaggaaga gatcaccttc 81060 

gcccccacct atcgatttga aagactgacc cgggacaagt atgcatacac gaagcagaaa 81120 

gcaacagggg tgagtcctcc cagaagccac tctcctgccc tgtcccacct cctttaccca 81180 

actcactatt ccatggtggt tcctagaaag tgggaagtat cctacctcac cagacaacag 81240 

caaaaacaaa agtcagaatg cacatacagg ggccagggat ggaactcagt catagagtgt 81300 

ttgcttagtg tgcacaaagc cctgggttcc atctcagcat caagtccagc atggcaccat 81360 

ctatctttga tcccaatact caggaactag aagtaggaag caagaagatc aggggttcaa 81420 

ggtcatcctt gactacaatg aatttgaggc cagtctgagc tgcacgagac tttgtcctcc 81480 

caccccccaa aagaaaactt tcacatgctg acttatatct tggtacaaaa ctgccaccaa 81540 

cttgtacatt aaaaaataat ccaaaaagct gaattaacaa tgaccctatt ctaagatgct 81600 

gagagaattt taaagcattg tttcttttgt tttgttttgt tttgttttgt tttgttttgt 81660 

tttgttttgt tttgtttgaa acagggtctc actatgaaaa tcctggctgt cctggaactt 81720 

actatgtaga ccaggctggc ctcaaactca cagagatcca tttgcctctg ccttctgagt 81780 

gctggcatta aaggcatgca ctactatgcc tggttactag taaggataat tttatgtgta 81840 

ttagtggttt tgcctacatg agcatctggt gcccacagaa gctgaaaggg ggtgtcagat 81900 

ccccagaaac agaatcnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 81960 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnntgca 82020

agacaaccag agatttaagt actgagccct ctctaaaacc cttttaaaag acaaaaacaa 82080 

agatacacac acacacacac acacacacac acacacacac acacacacac accccacaca 82140 

cacagttctg ttagctgaaa ctaaacaggt tctcatgcat ctgagtgtgg tggtgcatac 82200 

ttgtaatccc aatacctggg aggcttaagc aggaagatca taggttcaag gccagcctgg 82260 

actacatggc tagactccaa aacaccacca gcaaaactga tttctttgct tcaggactta 82320 

gaatttcaaa acattgtagt tctccattga taaagccaca ggccaaagta aaccggtcac 82380 

agagtgactg ttaatatgtt aataaggcca ctcatgttga aaggaataat gtttggggaa 82440

agatgcaata aaatactctt gcagaggacc tgggttcagt tcccagcacc aacatggctg 82500 

ctcacaacca actgtaaccc caactccagg ggtctaacac tctcttctga cctttgtggg 82560 

cactgcacac atgtgttgca catgcatata tgcaaacaaa gcactcacac acataaaata 82620 

aataaatata ttttaaatag tttagcacta tctgagggta aattccctaa agaaccaatg 82680 

acagctgaac ttgatgatgg ccgatgtcct cattataggg tggttttgtt tttaataata 82740 

acagagaaag cagcacatgt gctataaaac tgtcatgtta cattgcagat gaagtacaac 82800 

ttgccgtcct ggtgcgaccg agtcctctgg aagtcttacc cgctggtgca tgtggtctgt 82860 

cagtcctatg gtgagtggaa cacggtgggg tgcaggctag gttttgggtt ctgaggacag 82920

tagcaagcca aggggcttca gtctgcttct ccataagatg agcgagtgtc ctgaaagagt 82980 

ttgtgcatct ctatccccct tgggctgcag tgtaataaat cccgctcaga gagagacact 83040 

ttaactagaa aggacttttt gttgttttct ttttcttttt caagatttat ttattttata 83100 

tatatatata tatatatata tatatatata tatatatata tatgtatata catatgaggg 83160 

catcggatcc catgacagat ggttgtgagc caccatgtgg ttgctgggaa ttgaactcag 83220 

gaccttggac ctcaggaccc ctaatatcat aaacaggacc ttctggtact attcttttta 83280 

agagaaccag actgagtttg agccatataa aagtgatatt ttatcaggta taaaacaagc 83340 

aactaccagg tattcacatg gttcagccta atgcatatca ttaagtatgc ccatgcagat 83400 

ttggagaggc ctaagaatat tttattgtgg ggctggagat atggctcagc agttaagagc 83460 

acttgctgat ctcccagagg acctcaggtt ggttcccagg acactcactg tgtggtccac 83520 

aacctcgaac ttcagctcca gaggatccaa ctgcctcttc tagatcagga gcatctacac 83580 

acatgcacat gcacacgcac acacacacac atactttaaa agacaaaagg aaatcttaaa 83640 

acacacacac acacacacac acacacacac acacacaaat gtccaggcca tggcactcaa 83700 

ccctcatccc accacaaacc acaggtgagc aagtcataag ccagagacct ctgataggca 83760

gctcatgacc catgctgtcc agctgagccc tcatgcctac ccatccctga gagtagagaa 83820 
actggttggg aggatttggc cactgactag catccacctt atggtcccct ggagcccatt 83880

taactccgat cccacagatg aaggtggttt ctgtcccctt catctgtatg tgtctgctgt 83940

tgccctggga tactttatca catgcattac aaccccctcc ccctgctctt gagctctgtc 84000 

tcccagaagt ggcctcacta gataaacttc atatagcacc tgtagaggtg tgacacagaa 84060 

gcatctctta cactcacagc cttggccaag gcactcttcc gccttcagaa cctgaaagtg 84120 

ggacaggagc tatatattgc tcaggggtag agaattttcc tgccatgcat aaggtcctgg 84180 

cttccattcc tagtacttcc caaaccaaac aaacaaaaac gccagaaaga tggtccatct 84240 

cccatccttt accaggactc cttcctcagt gtttcctgct tgtctgcccc caccaacacc 84300 

tgatctctcc agccctgtga ccctctttct agcactacac agagtcagtc atatggaaac 84360 

ctactcccca aatcctgcag ccctgagggc cctctcgatt tctgactcaa aaggtcctgc 84420 

caactatctc catcagtcag cgacctctgc cgagggccaa gtgataaagg ctcatagaat 84480

tgccactatc tctttagaca ccagccttgg ccttcagaac tgctccatag cacaggtttt 84540

aagaaagggg aatgcgatgg atcaggaagt ggcaactaag attgaggaga gtaaacatca 84600 

cgaacattcc cagcatgtga gaggcaaaaa ggaaaaccaa ccttatatgt aaggctgtgc 84660 

catcttccac aggaaggctg aagatgtata tattttgcta atattctttt gagagccaga 84720 

ggacgcagta gaagggttta aggagttggc catagaggaa gccaagaagt tggggaaaga 84780 

aaaaagaatt catatctaga atgcaagaaa gggggccctg ggacgggcgg tggtggcgca 84840

cgcctttaat cccagcactt gggaggcaga gacaggcaaa tttctgagtt tgaggccagc 84900 

ctggtctaca aagtgagttc caggacagcc aagactatac agagaaaccc tgtctcggaa 84960

aaacaaaaca aaacaaaaca aaacaaaaca aaacaaaaca aaaacaaaaa caaaaacaaa 85020

aacaaaaaca agaaaggggg ccctgggttt tcttccaact ggttcataat cgggtgtatc 85080

aactatctat cgctgcatag caaatgccct aaatcccagt tcttaaaaca ctgagtcctg 85140 

ggctggagaa atggctcaga ggttaagagc ctgagttcaa ttcccagcaa tcacatcaca 85200 

tggtggctca caaccatctg taatgggatc tgatgccctc ttctggtgtg tctgaagaca 85260 

actacagtgt actcatatgc ataaaaataa ataaacaaat ctaaaacaac aacaaaccac 85320 

tgagtccttt gctcatgatc ctttggctgc actgaattca gacagctcat ttcttctctt 85380 

tggtgtcagg tggacacact gaagtatttg atcagatctg gggttggctg actgtcagca 85440 

agggtgatgg cgatgcccag gaggcaagtt caggcacctt tgcttcttgg actcagggtc 85500 

cccagactag gtctcagctt gtccctacag tgagtttgtc atccacacct tggtcaaaag 85560 

gtctcctcta tccctactta gaaatgcctt ctctcccttg caggcagtac cagtgacatc 85620 

atgacgagtg accacagccc tgtctttgcc acgtttgaag caggagtcac atctcaattc 85680 

gtctccaaga atggtaagca atgggcaacg tcagcttttc ttgttttcct caaagacaag 85740

gggcctaggg catttgtcat ctggttgcag ctaccaattg tctgggttag atgtaggcct 85800 

atcctttcct tcccaaggcc catggtctgc cctacctgat ctcttcatgt tcaggccaca 85860 

tcacaatctt agaccagaaa accatatata tatagtgttt taagtaagtt tcttctggat 85920 

aagcaagtgc ttcaagtttc ttccaattgt ggtccaattt atttaattcc atctaagttt 85980 

aaaaacggac tctaattaaa gaaaatatta gtccagtgtt agtgtgggtg catagtaggt 86040 

tgagatggtg aaagctatgt cggtgggatg tggatgaagg aggggaggga aactggaata 86100 

aactcatgct ttgagccaca ggtatggttg caaattgccc agcccagccc tggttgctga 86160 

gctacccttt ctgcagcaat ggctgggtgt gcctgttaac aagtccatcc ggtcctatct 86220 

taggtgacca ggaagccatg ctgtcatctc ctctacccac ttctgtgcag cagataactg 86280

atttagggat gctgggtatg ggtaacccaa atacagacag agaacagctg tctcactcta 86340 

tggcctctgt gctgagggat gtagacgaag ctagacctct gtctttccaa ccattggatc 86400 

aaacatctgt ggatctgatg tgacctccct tctccaccca agtcttcaca cctgtccagt 86460 

ctccttccta catctggcca cctttatagc ctttaggctc caccccttcc tgctcatacc 86520 

tgtcccctcc tttgagtttt tcagcattga aagagtagcc tctatcacct ccctctttgt 86580 

cgggcttggt ttccttctgc tgggattcag cacacagcag gtgcttgaaa atattggtca 86640 

ccagtttgga gttttgatgg aacatttgtc taagcagagt ctggccaagg tgatgtgggg 86700 

tttagaagga gagaaagaag actaagaggc atggtggagg ttgctgtcaa gtggcttaga 86760 

atttatatga aaagatagaa cacttgctta aaaccagttc aagatcagtg tggtggtaca 86820 

cttaggtggt ggttctagcc cttaggaggc agaggcagca ggctcagaaa ttcagtttga 86880 

gtctgactaa actacataga gtttgaggcc agtttgggat acatataaca ccctgtccta 86940 

caaaacaaaa acaatgataa aaaacagatg ctgtgtaaat aaaataaatg tcatgtaaat 87000

tagcatagtt ataaccatga tagctgaagg ccagtaagtc ccagattcca gagtagagac 87060 

tggttataac tgtgtctgag gaagtaagag caggtctccc tagctgtgga ctcaaaacag 87120 

cagatttggg tgttgagcct ctaactctct gccagaactt gtggggaaat ttggttgtat 87180 

tagtgcatgc ctttaaccct agcagtggag cttaagaact tgatgcagga agatcccaag 87240 
tttgaggcca gccttggcta catagtgaat tctagacaca cctgttctat atagcaaaaa 87300

ggagagaggg acagagggag aaggaggaag taggaaggga aagacagaca gacaacagag 87360

acaaagagac agacgggtgg ggggagggag gaagagaatt aattagttgg gagactagag 87420 

tgaggttgag aggttgcccg aggccacaca gattattgtc aaagctggga ttgtagccat 87480 

atcttgggtc cttgctgtct ttattcttcc tgacctgcac cctacatgaa cagcttgtga 87540 

cagaccagag aatctgtggc catcaacaga gatgctgagc ttgggtcagg aagctttcct 87600 

ttaagacaaa tctctccttc caacctggac atttcccatc tgtgggccct cagagggagc 87660 

ctggcagagt gggtgcacag agaggaaaac cccagaaaga agctggaacc tccatcttca 87720

gttcaggtac agtatacaaa taaatcctag tctcccatgc tccaggtcct ggcactgtag 87780 

atagccaagg gcagatcgag tttcttgcat gctacgccac actgaagacc aagtcccaga 87840 

ctaagttcta cttggagttc cactcaagct gcttagagag taagtgcctt gtgaactacc 87900 

ctttggggag gttctttcac ataactacct ttccattgat aactggttga tccctcaccc 87960 

agttcattcg ttttctctct ctctctctct ctctctctct ctctctctct ctctctctct 88020 

ctctctctct ctctcataag aatccatgga tggacccctc agtgggttca gcaaaggtgg 88080 

ggtacctatg gggctaccca gtggctatgt ggccctttcc ccacccctgt ggtttgttct 88140 

tctactgcct cagccctaga gatagaggca gccactggtg tccatgtccc tgtcccctag 88200 

gtcccccctg tccctgactt gtccatcaac attctctagg cagaatccat ttgagtttcc 88260 

accaagctgg gtttgagttt ccattaagct gggtttgagt ttccaccaag ctgggttctt 88320 

cttgctcagt gaaagccctg ggcaaagctc cagaatgctg aagatctaac agggcctaaa 88380

tagctgctac tgcctgcttg ctgttgccct ggccctcccc ttgccctttc catattgtgg 88440 

ctcttgtgtt gtggggctca ctgtgactgg gagcaaggta aagatagaca gcccatttca 88500 

ggtggcaacg ttgagcactc accatgctta tgaaagatcc tttctataga acatctccag 88560 

acatactgtc tgagactcct atgcaagaaa actatatgct ttctaagtct tcctgagatg 88620 

gttcaaagag tgatggggaa ggataggaac caaccttgca aacagaggca cagactgtct 88680 

gcttgtagag cccaggagag ctaccagcct tctaccccct cacacctttt tgcaccttcc 88740 

ttcaatgcat cccgaacctt cctgggatta tgccaccttc accacaacat cctgaagccc 88800 

cagctcagca gccagggtca aaacagttcc aactggaatg aggggctttg tgcgggtgtt 88860 

tttaataaac tatcgagaca tttgcataga tttctatgga aacagcatat taggaggctt 88920 

aaattagaat ttcaaaaggc tgctatcttg agtgtcgtgg gaagtgggac gagctgtcat 88980 

gcttcccagt cttcctgcct gcggtgtgat gacttctttg ttgcctggga tgttgcaaag 89040

aggagtcaaa aaggaggggg aaatctcttt gatttttcct aaaaataatt agactgtgtt 89100 

ggcaaatgac catccttcaa acaaaaacag aatccccaga aagcctgact cctaattggt 89160 

ttggaggcct cagagacatg agaagggaaa gacagtcctc cgtggtcgag gaaggcgacc 89220 

ccttggagac ttttctgcct ggttctcact ctgtggtctg caaacccagc ttcttcttct 89280

ctaacaggtt ttgtcaagag tcaggaagga gagaatgaag agggaagtga aggagagctg 89340

gtggtacggt ttggagagac tcttcccaag gtaatccagg aagaaaatgt gcctggggca 89400 

gagggctgca gcagtgcagg ttagtacgca gcactgcagg ctagtgtgca gctacagcag 89460 

tgcaggcgag tattcagctg cagcagtgca ggtgagtatt cagctgcagc agtgcaggcg 89520

agtgtgcagc tacagcagtg caggcgagtg tgcagctgca gtagtgcagg ctagtgggca 89580 

gcagtgcagg ttggtggaca gctgcagcag tgcaagttgg tgtgcagcag tgcaggttag 89640 

tgggcagctg caacagtgca agcgaatatg cagctgcagc agtgcaggtt agtgtgcagc 89700

tgcagctctg taggctagtg tacagctgta gctggtctgc ccagagtaca gctacagcag 89760 

gttcatccag actgtagcca tcccagacca aagccataga gaccagctta gacactgtag 89820 

tcaagggact tagggtatgc ttttcatcat ggatggatga cctcaatatt cctaagtggt 89880 

tttcagccct gctgtgccta ctcccctttg gttattacct acttagtgtt cgcagcactt 89940 

gagattgaat cgaggctcct gcaagcacta agcaagcacc caaccattga gctatgtccc 90000 

tgatttccct gaacactttt tagcagcagg aagttgaggg cagtgctgac atccaggagt 90060 

ctgtcactca ccaagccaca tctctagcaa accacagaat atttcaagcc caactggagg 90120

tttgtgacac atttactcag acagggagca ggcctatggt gtcttctcta atgcagccat 90180 

agaccttgga ggggtatgag gagagggatg gtactgcaag tctcatttca gacctggctg 90240 

ctagaccttc ccttggcctc tgggatatgc aaagaatgga aggaagaatg gaggcaatgg 90300 

accagggacc tggaagctca actcaaccca gtttaggaaa tggggtgggg cagggaagaa 90360 

aactatctct cacttggctg agcaatgggg tccacctgct ctcttgggca tgaagagcca 90420 

agacctgagt ttgcttaaaa gaacatattc agttccgttt tactatgtat ttatttattt 90480 

tatgaatgag tgctctatct gcatgtacat ctgcatgcca gaagagggta ttaggttcca 90540 

ttacagatgg ttgtgagcca ccatgtggtt gctgggaatt gaatgcagga cctctggaag 90600 

agcagccagt gctcttaacc actaagcctc tctccagctc tcctgatgca tgcaaacacc 90660 
aagactgctg acccccattt acctgtcctg acctgccctc acctgtggac tctaatctgg 90720

tgtctcctct gcagctaaag cccattatct ctgaccccga gtacttactg gaccagcata 90780 

tcctgatcag cattaaatcc tctgacagtg acgagtccta tggtaagcat cccaggccag 90840 

ggacctcacc gtaccttagc ctcagcctct gaagtgaggg tgatggtggc ttagcctttc 90900 

aagaaagcct gtctgccacc acccattgtg ccctgatgct gccacttcct ctgtcccaga 90960 

gggacttttc ctagtgggtt accatccctg ctccttgtca cttctgggtg ccacagccag 91020 

taacctcctg cttagccttg ctgagtgagc tgcaggtacc atgtaattta atcttagcct 91080 

taagaggcag gttctgttcc ccagtgagta aataggtaaa ctgaggctca tagaggtcaa 91140 

gtgactcccc ccaggtcaca cagatagtcc tcaactgagc tgtggcttga gcctgagtct 91200

gcctgtgcca ccgtgtctct cgttagtccc tagccagctc ctcagaccca ggtgcagaac 91260 

attagcccgt ccctgctgcc aggtcagctt tccagtctgt ctccttttcc tttcagactt 91320 

ttccttgtat agagatttta gactgactta cactcttacc tgtttatctg attttgttta 91380 

tttcttgtac tgtttcaaac cttgggaagt aagctttccc tatagcaacc agattaacat 91440 

ttaataccat ttcttagatg gatttttcta ggatttaatt tttccctcag cattgatctg 91500 

tatagagttc tttttcctat gaggcttcag acatttgccc cagattgcta aggcatcgat 91560 

tgaaagggtt tgttggcaat ccctgcccgg ccacttgctt caaggagctg tctcccagtg 91620 

gtacagccat ctctctgagc atgctctata tccgtctaat gccaagatgt tcctattgat 91680 

gctctctctc ctttccccag gtgaaggctg cattgccctt cgcttggaga ccacagaggc 91740 

tcagcatcct atctacacgc ctctcaccca ccatggggag atgactggcc acttcagggg 91800 

agagattaag ctgcagacct cccagggcaa gatgagggag aagctctatg gtaggtcagc 91860 

cagcctcctc ctggctttcc cagaggccac tgcactaagg acatgttctt tctctcagca 91920 

aagctatcaa gtatctatga tctgtgcctt cagaatatag cgtggttaga aagtctctag 91980 

ggtagcgtgg tgatcctggc actcaggagg ctgaggcgag aagatcgtga gttcaaggct 92040

agtctgggtg acatactgag acgatgtctt taatgagaag aaaacaagtc atctcattta 92100 

tctctcacaa ccgccctgtg agtcaagtat ggttatctct acttggcaga tggagaaact 92160 

ggagctagat gactggctta atgttgcaaa ctcaggctaa gccaagttca tccagggcca 92220 

agcaggagct ggtagacata gtggttgctt gcctgtcatg aaaatagcaa gtgtaggaac 92280

gcacaacatg gggtgcaggg gctccttatg ccagggaggg ctctaagggc cactgcttct 92340 

gtgcatgtga ctctcctgtg ccccatgaga gatctttctt ttctctagca ggacagtggg 92400

ttgaacagat agtgggtcat tacctagcaa ccacattcca gaaataaagc ccggggttcc 92460 

tttaaactga tttatagggg ttcccataaa cagataagct ctgcatggga agtactctta 92520 

tagggtaacc taatagctgt tctaagtttg ttctagtctg acaagaaatg tctgttttat 92580 

ccatggctaa gctaacaagg tatttcttga aataacaact gggtcataag tatctgttca 92640

tctccttctg gacccatggc aggcatcatg gcaggcatca tggcaggcaa cagttgccaa 92700 

gcgtaaggat tttgaagcct gaaaatcttg cttgatatac ctctcccagt aaccactatt 92760 

tactgtccag ctttggcctg caaagagata ccacattgcc atttctaacc catgccgtgg 92820 

ctaggatctc acatgttttc tttcctaaat gttctcttcc agactttgtg aagacagagc 92880 

gggatgaatc cagtggaatg aaatgcttga agaacctcac cagccatgac cctatgaggc 92940 

aatgggagcc ttctggcagg tagacgaagc ttgctaagac ttattacagc tagctgggct 93000 

gtgtgaacca gagtccagga gagggtaaag tggagttcag gaaagccaag gtcagaggag 93060 

aagaaatgtt gtggcccagg ggacccactc tcctacctca gtttcacatc ctgagttcaa 93120 

atcctcatct ctgaaagatc aaagccttgg agaacatttc catgtaggaa gggctcagct 93180 

caactatcct ttaatgaaca cctgctatat tcaggaagac acagtgtaaa tacagtatag 93240

tccatgacct acagacttaa caccagccag tggaaggatg tgcagtgcag gaagaaggag 93300

ctatagatga ctgaacaatt ccagggcaaa agatacttct cctgcattca gggaggccag 93360 

gaacataagt ggtcaaaagt caccacccat gacttcctag cttcatttct gttgctgtgg 93420

caaatgccct gattaaaagg tagcatgggg agggaaggga tcatttgact tcctcaagcc 93480 

cacattacag tctagtgttg tggagaagcc agggcaggaa cctgaagtat cacatcaaca 9354 0 

gtcaagagca cagagaataa acacatgcac ccttgcttgt tctcagttag ttttcttcac 93600 

tcttacacag ttcaaggccc agccaaggaa atggtgccac ccacaatgga ctgagtcttc 93660 

ctacattaat taacaatgaa gacaagtccc tatagacatg cccacaagcc aaacttatct 93720 

agtcaattct tcataaagac tctcttccct ggtgatccta gattgtgtca agagggcatt 93780 

tcaaaccaac caacacagag gggtgtagcg gggcatgcct ttacttccag aactcaggac 93840

gtagacgggc tgatcttgtg agttcacggc cagcgttgtc tacacagtga attccaggac 93900

attcagagct acataatgag accctatctc aagaaacaaa ataaaacaaa acacagaaag 93960

actaaccaac acagtgacta acaggaaaga cggattggag agtaacctgg gcattccagt 94020

ctatgttgtt tcctgatatc agaaacaccc ctgtttcctg atgtgctcgt gttagtcctc 94080

ctgagatgga gcccattgtg gaacaaggtt tggtgtgtga ctacaaaggt tgcctcacct 94140

ggctctggac tgtctcctcc agttttaaca cagctgccct ccttaccacc tctagcctca 94200 

ccctctgtgc cgcacactca ccctgctggt ctgatgcctt ccgtgggatg tgcaaaccaa 94260 

catgtcccac gtgactctcc cacagcccca ggcccaccag gttctcattc tgctctcccc 94320 

atgtgctgac cttgtcttca gtgaacccca caacagggca gcctctatcc ctgaccacag 94380 

ggccttagca tggacctttg cctccacaca gagcatcctc ctctcttttt ctctcttctg 94440 

tactccatat aaattcaatc cgcagacagc cttgagtcaa tcttctccag ccatgttcta 94500 

cctggtgggt ctccactctc ctacttgtct caacttgatc tcttatttct ctttgccata 94560 

ttatacacgt accagaatga gtaaatgaat tgttgtgccc aagaaataaa ggtactgaga 94620 

ggctgagaag ctgaggaaca ttttcctcag gctggtgcca gccaagaagg gaagaggctt 94680 

caggtgtgcc tccacatctt cctacagggt ccctgcatgt ggtgtctcca gcctcaatga 94740 

gatgatcaat ccaaactaca ttggtatggg gccttttgga cagcccctgc atgggaaatc 94800 

aaccctgtcc ccagatcagc aactcacagc ttggagttat gaccagctac ccaaagactc 94860 

ctccctgggg cctgggaggg gggagggtcc tccaacccct ccctcccaac cacctctgtc 94920 

gccaaagaag ttttcatctt ccacagccaa ccgaggtccc tgccccaggg tgcaagaggc 94980

aaggtgagtg tcctctgaat tgtgtgtgtg tgtctgtctg agtgtctgtg tttctgctca 95040 

aaagcatccc ttgggggctc ccaagggtga cggcctgaag agggcagagt tgtagtaggt 95100 

tctgcccact actttggctt ctgcctgtcc aaaacagttg gtgcaacttg attttaaagg 95160 

ggactgggtt gggacttgcc aagaaagcca tcttcttata aaaactgcat ttactcacaa 95220 

agacatttct caaaccaatg acaatctatg ccatcctccc cttcctggag ttctcagtgg 95280 

gaaagaggtg aggatttctg aatagacacc attatctacc ccaaatctct ctctttcctt 95340 

aaaaacaagc attcatgttc tcatatataa ctatctgggg gctggagaga tgactcagtg 95400

gttgagagca ctgactgctc tttcggaggt cctgagttca attcccagca accacatggt 95460 

ggctcacaac catctgtagt gagatctgat gccctcttct ggtgtgtctg agaagaatga 95520 

cagtgcactc acatacataa gtaagaaaat aactatctgg accctcctga tggaaacata 95580 

tattggaaac taggaaccca aatgaaggca gagctgtcat cctacagagg gagccggcca 95640 

gaacaggttt aggagcaagg accacacagc ccagagatga agtcctaagc agatatggga 95700 

aacattaggt gggaatccca ttcctacagg atgatagatg gccaagtgac atccagacct 95760 

agttaacagt gacacagatg tgtctcctcc ccagctgcct tgagcatttt gtggtgacac 95820 

cggctctcac attcagcctc ttttctgagt gctgtttctt ttcttccctt ttcagaactt 95880 

cagttcctta gcagagctaa ctcatagtaa tcagggaccc gtgctggagc atcacccaca 95940 

gtgcggttct cctccagacc ctctaagtca aatgcctata ccaggcctgc ctggggattc 96000 

ctgtggagag gcactgacag tcctgtgcct tagctgttag ctgaactagc tgaaaggtgg 96060 

gagggcaggt cccttagcca gactgaagtc tacttcctag gagcaaggga gaaatcgcct 96120 

tggcatccct ccccggaaat gaggatcaag gtagccatcc agaaggtact gaggtacttg 96180 

tttgacaaag gcagtctctt tcgagacccc atgaagcaga gctagagaca gccacaaaga 96240

aagcacaagt tcatggactc agaggttccc agagtggaag tcactgtgtg cttcacacgg 96300 

tgaacagagc ttggtggaaa catgtcctct ccagccccag gtgacagcat aggggtagac 96360 

agctatcagg gagcgagcta caggactagg tggcaatagc cacccatccc aggttcccca 96420

aggctgcccc aactttagca tttaaaagtc ccatctcctg gaaaacactg cagtcctggt 96480

aagcttggac caccaccatg tgaggtcaag gctgtgtgcc aagaaaggag actctgctca 96540

ggccctagcc ggcatccatg gtctcctgca caagaactga ctctgccccc tggataagag 96600 

cttggtttcc tgttgcctat actcaaactt tttttttttt ttgactaaaa ctcgtgagtt 96660 

taatctgttt ttcctaacct ccattaggat agcaggtgta ccctgaacac tgctggaagg 96720 

atcactttct agcttctatc tcattggctc catcctctgg cctctctatc tctatagctg 96780 

agtggatagc cagatgccac acacatgcac aggcacgcac acgcacgcac atgggggggt 96840 

ggggagatgg ggggggagtt gcttgcttct gtctaccaca atgccctcag aggccagaag 96900 

aggacattgg attcccaaga actgcagtta catacagttg tgagttacta tgtagatgcc 96960 

aggagtcgaa cctggtaact ctggaagaag agcaaatact cttaatcaca gaaccaactc 97020

ttcagcccgg catggtgttg catgccttta atcccagcac tcagcagaca gtgaggcggg 97080

cggatctgtc agttctatat caaccaggga tacgtagtga gccctggtgt ttaaaaagaa 97140

aaattctcaa ggcagaatcc atgtaaaaat tattccctgg aaaataagta attcggaggc 97200 

atttctgtgg tgcatgttat atgatcacta ctgtagttag ttccagagca ttcttatcac 97260 

ccctaaatga aacccagagg catgaagcaa ctactcccca tgtcctgact tctgaacata 97320 

tttcctcccc atctcctctc cttttcccca tggttccaga cctggggatc tgggaaaggt 97380

ggaagctctg ctccaggagg acctgctgct gacgaagccc gagatgtttg agaacccact 97440 

gtatggatcc gtgagttcct tccctaagct ggtgcccagg aaagagcagg agtctcccaa 97500 
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gatgctgcgg aaggagcccc cgccctgtcc agacccagga atctcatcac ccagcatcgt 97560 

gctccccaaa gcccaagagg tggagagtgt caaggggaca agcaaacagg cccctgtgcc 9762 0 

tgtccttggc cccacacccc ggatccgctc ctttacctgt tcttcttctg ctgagggcag 97680 

aatgaccagt ggggacaaga gccaagggaa gcccaaggcc tcagccagtt cccaagcccc 97740 

agtgccagtc aagaggcctg tcaagccttc caggtcagaa atgagccagc agacaacacc 97800 

catcccagct ccacggccac ccctgccagt caagagtcct gctgtcctgc agctgcaaca 97860 

ttccaaaggc agagactacc gtgacaacac agaactcccc caccatggca agcaccgcca 97920

agaggagggg ctgcttggca ggactgccat gcaggtatgt tgagctgtat gtatatgggt 97980 

gtatatgcat atgtgtgcac gcatgcatct gtgtgtgtgt gtatgtgtat gtgtatcctg 98040 

ggatgtgctc tgtgaagaag ggctaaacct ggggtttgta cctggcatct gttcctgttc 98100 

ctgactggtc ctacagggca tctgcagtga ttcttgagag cgacagagag gagctgttga 98160 

cacctgtaga gtaagggagt ctaagtcatt tatattttag tcttatcatc cttaattagg 98220 

gtatgcactt aaaacattcc tggtgtttta gagacctcag gcagaatatc cccttcctat 98280 

agaatagatt tattgtagag caaaactaac aatgacttaa atacacacac acacagagtt 98340 

gagactggac atgatagtac accgctataa tcccagcaca ctgtacatct agagcaagag 98400 

gattggaagt tcaaggcaag cctgggctaa atggtgagac agacaacagc agcagtgaga 98460 

gagctaaggt tatatagctc agtgggagag tgcacaggaa gcctccagaa agcccttagg 98520 

tgaatctcca gtatcacaag gaacagagtc tagaactaga ctgctggcca ttccattagc 98580 

atacttattg cagcattgag ggtcagactt ggggccttac actaggcaag cactgtacca 98640 

cagtcctgtg ttggatttaa atcccaagtc tccactgcac ggctctatgc ctttttaaac 98700 

tgtcaaaaag aaactactag tgccagccat tgtgtagatt tatggggaga aaaagattat 98760 

gccatatact gttattgttt gtgatcattt ttacagttat cactcatttt gaaactatat 98820 

acaacatata tctatctcat aggtactgtg tttgtaatat atgcatgaat taacatataa 98880 

ggcacttagg acagtgtggc tctttcacat taattagcct gacctaactt acagcactag 98940 

gggtcaaatc tggcaaggcc ctgatagccc tttctgttct catcttacag tgagctgctg 99000 

gtgatcggag cctggaggaa cagcacaaag cagacctgcg cctctctcag gatgcctctc 99060 

tcaggatgcc tcttggagga cctcctgcta gctcttcttg cctagcttca agtcccaggc 99120 

tgtgtatttt ttttcaggaa acggcctcac ttctctgtgg tccaagaagt gtgctgctgg 99180 

ctgccacact gtgcggcaga tgctaaagct ggatgacaaa cgcacgccat acagacagca 99240

gacagcggca ctgggtctca gaacttggat tcctgggcct tcttccagtc gccgttttaa 99300 

agaaaggaac taacggagct gctcatccga gggtgaagat ataaataata atattattaa 99360 

taataataac agtcaggtgc catgtgctgt gttaagtgct ttatgaacat ttgtcgggct 99420 

ggcctccagt gctgaggtgc cagtcagcct gaaccctatg cccaggccca ctaatcccaa 99480 

atggtgggtc ctgagatgtt tttaaaaagc attaaagaaa accatcggtc tcttagagct 99540

aaccggccgg gctctactgc agggacccga acagtctgca tggctaagtg gcacaaggag 99600 

cctggccctg tccagcttca gagatccaag ctgctttttg ctggggttct gtcacaggcc 99660 

tgatcctctt ggtttttatg gggtttcaag tctgccagag tcagaaatca gctctaactc 99720 

gccagtgaag agatctggcc ttaacttaag ccagccacgt caggcccctg ctgagcctat 99780 

ggaccaataa atactccccg tgccactgga ggtgggcagc tatcaccata ccctgagttg 99840

ggccaagccc accccacccc taccctgcaa catttctgat gtactgagga agagtctcca 99900 

ccatagtccc caagggctga gttctccagc ctgctatcag ggaaggtgag cattggtccc 99960 

aggctctcaa aatagtgcag cctcttcttc ccaagctctg gggtgcaccc tgtgtccttg 100020 

gttaccagga gactagggtt gtgatatctt ttcttgtctt gctttttgat atatcaggat 100080 

taatgtagga aaccagacct agattattca ggagagtagg tatatcccct gtgtttccca 100140 

<210> 2 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 2 

tactcctcag caagagtagc tgg 23



<210> 3
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer

<400> 3 

gctgaacttg tggccgttta cgt 23 



<210> 4 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 4 

cttctatagc cttcccaagc c 21 



<210> 5 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 5 

ctcgtaggtc tcacaggaag 20 



<210> 6 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 6 

cctgctggat tacattaaag cactg 25 



<210> 7 

<211> 25 

<212> DNA 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Primer

<400> 7

gtcaagggca tatccaacaa caaac 25



<210> 8

<211> 24

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Primer

<400> 8

ggcgttctct ttggaaaggt gttc 24



<210> 9

<211> 20

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Primer

<400> 9

ctcgaaccac atccttctct 20



<210> 10

<211> 23

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Primer

<400> 10

ttgctgcacg agggctcaga atc 23



<210> 11

<211> 23

<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Primer

<400> 11

tccgattctc atgctctggc ttg 23

<210> 12 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 12 

cagccctgtc tttgccacgt ttg 23

<210> 13 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 13 

tccactggat tcatcccgct ctg 23

<210> 14 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 14 

cttcctcttg caacagagaa ccc 23

<210> 15 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Primer 

<400> 15 

actcaacgtc cactttgaga tgc 23